Jump to content

Fast Gi Anyone?


Recommended Posts

  • Replies 137
  • Created
  • Last Reply

Top Posters In This Topic

A good point, hoknamahn! This is one of the "to be done" things. It's so cool to dig all this stuff, too bad one has also think about "ordinary things", like making money for living, paying the rent and all :D

btw, has anybody got a nice document or any info about the occtree stuff? How does it work, the basic principles, algorithm description?

Link to comment
Share on other sites

A good point, hoknamahn!  This is one of the "to be done" things.  It's so cool to dig all this stuff, too bad one has also think about "ordinary things", like making money for living, paying the rent and all  :D

btw, has anybody got a nice document or any info about the occtree stuff?  How does it work, the basic principles, algorithm description?

19898[/snapback]

Do you mean octree in general or in hdk?

Google turns up loads of good stuff on octrees and kd-trees (which I think is what Houdini uses)

As for the HDK I don't think there are any nice documents on any of it anywhere.

Link to comment
Share on other sites

Have you used HDK kd trees for your code optimisation(i.e. boosting the speed)? 

No. As I mentioned in an earlier post, I used it to help determine how the elements should be grouped together, but that relates to some choices I made, and is not some kind of prerequisite for getting it to work. Not in the least. Please see earlier in this thread.

Can you give us a hint what should we be looking into?

Read the nVidia paper :)

No. Really. It explains the tree structure they use very nicely -- they even draw you a little diagram of how the nodes connect to each other, so that lateral traversal through nodes at the same level warps you up one level at the end of each sub-list. It's somewhat closer to an octree or an Nary-tree than a kd-tree, but actually none of those... I've been calling it a MipMap because I think it resembles that the most, although it's not exactly that either. (it's designed for fast traversal of a specific kind, not for partitioning space).

Long story short: it's tailored to the needs of this particular application, and so I think you're probably better off bulding it yourself.

Perhaps you can force UT_Tree<T>, or UT_VoxelMipMap, or UT_NTree to work for you, but it'd probably take longer to learn those classes (then apply them correctly) than just writing one yourself.

Hope that helps.

Link to comment
Share on other sites

That's just an average of the two passes, except that instead of using a straight average (which would be 0.5*pass1 + 0.5*pass2 or simply (pass1+pass2)/2) they use a weighted average where pass2 is given slightly more importance than the first pass -- 10% more weight to be precise. So their expression is: 0.6*pass2 + 0.4*pass1. i.e: "60% weight to the current pass, and 40% weight to the previous pass, but only when current_pass>pass1".

That's looking good, Hoknamahn! :)

19861[/snapback]

I think I almost get it, but here is a few things I don't understand:

What does the second pass exactly do? Is it that we simply calculate the occlusion (the total ammount of shadow per point) once again? Or do we add the second pass to the first pass? But isn't the occlusion per point the same as first pass?

Even if we sum the up - what's the sense of making 2 identical passes? Shouldn't we just take the result of the first pass and do these x0.4+x0.6 calculations?

Link to comment
Share on other sites

What does the second pass exactly do?  Is it that we simply calculate the occlusion (the total ammount of shadow per point) once again?  Or do we add the second pass to the first pass?  But isn't the occlusion per point the same as first pass? 

Even if we sum the up - what's the sense of making 2 identical passes?  Shouldn't we just take the result of the first pass and do these x0.4+x0.6 calculations?

No. The passes are not redundant.

The occlusion results for each element are not the same (in general) from pass to pass.

The contribution from a given element during any given pass is modulated by the complement of the occlusion result that was stored in that element from a previous pass. This is separate from the weighted average used to combine two passes at the end of each pass. The rationale being that an element that is itself in shadow shouldn't emit more shadow than the portion of itself that is *not* already in shadow (keeping in mind that all of this is an approximation).

Read section 14.2.1 carefully again, and you'll see what I mean. Also, take another look at figure 14-4 which gives you a graphical representation of the problem that the multiple passes are intended to solve (although, admittedly, it is not a "perfect" solution -- but neither is it given as one. It's simply one way to reduce errors that result from the way in which that particular algorithm, as given, works).

Hope that helps.

Link to comment
Share on other sites

an element that is itself in shadow shouldn't emit more shadow than the portion of itself that is *not* already in shadow

Hope that helps.

20117[/snapback]

Thanks Mario, I am already ery close and my optimization works fine (boosting the coocking speed by approximately 4 times and more). Still here are a few things that confuse me:

"we use a second pass to do the same calculation, but this time we multiply each form factor by the emitter element's accessibility from the last pass"

Does it mean that we have to compute every emmiter element for avery other emmiter? To calculate how much our emmiter is shadowed by other emmiters? Is that right? And if it is - does it mean that in fact we get a 3rd pass (to calculate this "accessibility value")?

Thanks.

Link to comment
Share on other sites

Does it mean that we have to compute every emmiter element for avery other emmiter?

Uhmmm.... the way you worded that is a little confusing, but I think the answer is "yes" (though I confess I'm not exactly sure what you mean).

And if it is - does it mean that in fact we get a 3rd pass (to calculate this "accessibility value")?

No. There's no need for any extra "initialization" passes.

How did you come to that conclusion again?

Just to be clear:

*) What they call "accessibility" is the complement of what we call "occlusion", so:

accessibility = 1 - occlusion;

*) There is only *one* set of elements.

As we loop through all elements (in this *single* set), the current element being visited plays the role of receiver, while the rest are considered emitters. This repeats for all elements so that, in the end, they have all played both roles (emitter and receiver).

I'm probably wrong, but for some reason I get the weird feeling that you think there are more than one set of elements, so I thought I'd make that part clear just in case.

Cheers!

Link to comment
Share on other sites

What dependence between a receiver-emitter distance and depth of a tree/mip-map level, Mario?

I saw such piece of code in NVidia's shader

if (d2 &lt; -emitterArea*4) {  // parents have negative area
   emitterIndex.xy = emitterIndex.zw;  // use alternate index (go down hierarchy)
   emitterArea = 0;      // ignore this element
}

Do you use something like that?

Link to comment
Share on other sites

What dependence between a receiver-emitter distance and depth of a tree/mip-map level, Mario?

I saw such piece of code in NVidia's shader

if (d2 &lt; -emitterArea*4) {  // parents have negative area
   emitterIndex.xy = emitterIndex.zw;  // use alternate index (go down hierarchy)
   emitterArea = 0;      // ignore this element
}

In words, it says "If the distance between emitter and receiver squared is less than four times the emitter's area, then go to the next, more refined level".

Note that the factor of 4 is arbitrary (it's just a safe choice to get started) and, while it may be OK for a lot of scenes, it could also be too large (for large elements within a small volume) or too small (objects that are very large in one dimension but very small in the other two). So you may find it advantageous to spend some time trying to hallucinate something a little more robust. ;)

Link to comment
Share on other sites

In words, it says "If the distance between emitter and receiver squared is less than four times the emitter's area, then go to the next, more refined level".

Yes, I understand that condition except for a sign "-". Why we need it?

And what maximal depth of tree have sense?

Link to comment
Share on other sites

Yes, I understand that condition except for a sign "-". Why we need it?

I believe they store the area of non-leaf nodes as negative area. It's just a quick way to tell leaf and parent nodes appart, and likely a side effect of using texture maps for the structure.

So, no, you don't need it.

And what maximal depth of tree have sense?

Hard to say, since there are so many ways to go with that -- and most of the choices depend on how you populate the tree, so... I'll say "whatever you think is best". :unsure:

Link to comment
Share on other sites

Ok, here are my last results.

Currently I haven't implemented the second pass yet, as I've been strugling at getting a decent looking 1st pass + algorithm optimisation with KD-tree like approach.

I haven't played too much with my tree's parameters - so I kept average ones (in terms of speed).

System: Athlon1000, 768 RAM, Windows XP Home edition

Here I culled the back faces of the rabbit:

number of points of the scene: 17925

"kd-tree" generation time: 0.70 sec

cooking time: 3 min 55 sec

rendering time: 6 sec

bunny5fo.th.jpg

=============================================

Here is the full rabit:

number of points of the scene: 36669

"kd-tree" generation time: 1.52 sec

cooking time: 14 min 25 sec (around 40 minutes without optimization)

rendering time: 9.02 sec

bunny27nr.th.jpg

Later I'll add the second pass and will play some more with "kd-tree" parameters.

Thanks everybody who helped so much (Mario, hoknamahn and others).

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


×
×
  • Create New...