Fast Gi Anyone?

MADjestic · July 26, 2005

Congrats, Hoknamahn. No matter how hard I tried - you were the first to get the final result. =)

Anyway, now it's turn to optimize the code. It's gonna be fun, I'm sure.

hoknamahn · July 26, 2005

Madjestic, don't forget about bent normals and bidirectional surfaces!

MADjestic · July 26, 2005

A good point, hoknamahn! This is one of the "to be done" things. It's so cool to dig all this stuff, too bad one has also think about "ordinary things", like making money for living, paying the rent and all

btw, has anybody got a nice document or any info about the occtree stuff? How does it work, the basic principles, algorithm description?

Jason · July 26, 2005

Last results

19872[/snapback]

Very very nice! Well done, hok!

sibarrick · July 26, 2005

A good point, hoknamahn! This is one of the "to be done" things. It's so cool to dig all this stuff, too bad one has also think about "ordinary things", like making money for living, paying the rent and all
btw, has anybody got a nice document or any info about the occtree stuff? How does it work, the basic principles, algorithm description?

19898[/snapback]

Do you mean octree in general or in hdk?

Google turns up loads of good stuff on octrees and kd-trees (which I think is what Houdini uses)

As for the HDK I don't think there are any nice documents on any of it anywhere.

stevenong · July 26, 2005

Well done hoknamahn!!!

hoknamahn · July 26, 2005

Thanks guys! All this owing to excellent support of Houdini Community

MADjestic · July 26, 2005

btw, Mario,

Have you used HDK kd trees for your code optimisation(i.e. boosting the speed)? Or was it something you developed from 0? Can you give us a hint what should we be looking into?

Mario Marengo · July 27, 2005

Have you used HDK kd trees for your code optimisation(i.e. boosting the speed)?

No. As I mentioned in an earlier post, I used it to help determine how the elements should be grouped together, but that relates to some choices I made, and is not some kind of prerequisite for getting it to work. Not in the least. Please see earlier in this thread.

Can you give us a hint what should we be looking into?

Read the nVidia paper

No. Really. It explains the tree structure they use very nicely -- they even draw you a little diagram of how the nodes connect to each other, so that lateral traversal through nodes at the same level warps you up one level at the end of each sub-list. It's somewhat closer to an octree or an Nary-tree than a kd-tree, but actually none of those... I've been calling it a MipMap because I think it resembles that the most, although it's not exactly that either. (it's designed for fast traversal of a specific kind, not for partitioning space).

Long story short: it's tailored to the needs of this particular application, and so I think you're probably better off bulding it yourself.

Perhaps you can force UT_Tree<T>, or UT_VoxelMipMap, or UT_NTree to work for you, but it'd probably take longer to learn those classes (then apply them correctly) than just writing one yourself.

Hope that helps.

MADjestic · July 27, 2005

Thanks Mario,

Somehow I think I know what you are talking about. I hope what you are doing is not in vain and soon you'll see the first optimisation results.

Thanks alot.

MADjestic · August 2, 2005

That's just an average of the two passes, except that instead of using a straight average (which would be 0.5*pass1 + 0.5*pass2 or simply (pass1+pass2)/2) they use a weighted average where pass2 is given slightly more importance than the first pass -- 10% more weight to be precise. So their expression is: 0.6*pass2 + 0.4*pass1. i.e: "60% weight to the current pass, and 40% weight to the previous pass, but only when current_pass>pass1".
That's looking good, Hoknamahn!

19861[/snapback]

I think I almost get it, but here is a few things I don't understand:

What does the second pass exactly do? Is it that we simply calculate the occlusion (the total ammount of shadow per point) once again? Or do we add the second pass to the first pass? But isn't the occlusion per point the same as first pass?

Even if we sum the up - what's the sense of making 2 identical passes? Shouldn't we just take the result of the first pass and do these x0.4+x0.6 calculations?

Mario Marengo · August 3, 2005

What does the second pass exactly do? Is it that we simply calculate the occlusion (the total ammount of shadow per point) once again? Or do we add the second pass to the first pass? But isn't the occlusion per point the same as first pass?
Even if we sum the up - what's the sense of making 2 identical passes? Shouldn't we just take the result of the first pass and do these x0.4+x0.6 calculations?

No. The passes are not redundant.

The occlusion results for each element are not the same (in general) from pass to pass.

The contribution from a given element during any given pass is modulated by the complement of the occlusion result that was stored in that element from a previous pass. This is separate from the weighted average used to combine two passes at the end of each pass. The rationale being that an element that is itself in shadow shouldn't emit more shadow than the portion of itself that is *not* already in shadow (keeping in mind that all of this is an approximation).

Read section 14.2.1 carefully again, and you'll see what I mean. Also, take another look at figure 14-4 which gives you a graphical representation of the problem that the multiple passes are intended to solve (although, admittedly, it is not a "perfect" solution -- but neither is it given as one. It's simply one way to reduce errors that result from the way in which that particular algorithm, as given, works).

Hope that helps.

MADjestic · August 3, 2005

an element that is itself in shadow shouldn't emit more shadow than the portion of itself that is *not* already in shadow
Hope that helps.

20117[/snapback]

Thanks Mario, I am already ery close and my optimization works fine (boosting the coocking speed by approximately 4 times and more). Still here are a few things that confuse me:

"we use a second pass to do the same calculation, but this time we multiply each form factor by the emitter element's accessibility from the last pass"

Does it mean that we have to compute every emmiter element for avery other emmiter? To calculate how much our emmiter is shadowed by other emmiters? Is that right? And if it is - does it mean that in fact we get a 3rd pass (to calculate this "accessibility value")?

Thanks.

Mario Marengo · August 3, 2005

Does it mean that we have to compute every emmiter element for avery other emmiter?

Uhmmm.... the way you worded that is a little confusing, but I think the answer is "yes" (though I confess I'm not exactly sure what you mean).

And if it is - does it mean that in fact we get a 3rd pass (to calculate this "accessibility value")?

No. There's no need for any extra "initialization" passes.

How did you come to that conclusion again?

Just to be clear:

*) What they call "accessibility" is the complement of what we call "occlusion", so:

accessibility = 1 - occlusion;

*) There is only *one* set of elements.

As we loop through all elements (in this *single* set), the current element being visited plays the role of receiver, while the rest are considered emitters. This repeats for all elements so that, in the end, they have all played both roles (emitter and receiver).

I'm probably wrong, but for some reason I get the weird feeling that you think there are more than one set of elements, so I thought I'd make that part clear just in case.

Cheers!

hoknamahn · August 3, 2005

What dependence between a receiver-emitter distance and depth of a tree/mip-map level, Mario?

I saw such piece of code in NVidia's shader

if (d2 &lt; -emitterArea*4) {  // parents have negative area
   emitterIndex.xy = emitterIndex.zw;  // use alternate index (go down hierarchy)
   emitterArea = 0;      // ignore this element
}

Do you use something like that?

Mario Marengo · August 3, 2005

What dependence between a receiver-emitter distance and depth of a tree/mip-map level, Mario?
I saw such piece of code in NVidia's shader
if (d2 &lt; -emitterArea*4) {  // parents have negative area
   emitterIndex.xy = emitterIndex.zw;  // use alternate index (go down hierarchy)
   emitterArea = 0;      // ignore this element
}

In words, it says "If the distance between emitter and receiver squared is less than four times the emitter's area, then go to the next, more refined level".

Note that the factor of 4 is arbitrary (it's just a safe choice to get started) and, while it may be OK for a lot of scenes, it could also be too large (for large elements within a small volume) or too small (objects that are very large in one dimension but very small in the other two). So you may find it advantageous to spend some time trying to hallucinate something a little more robust.

hoknamahn · August 3, 2005

In words, it says "If the distance between emitter and receiver squared is less than four times the emitter's area, then go to the next, more refined level".

Yes, I understand that condition except for a sign "-". Why we need it?

And what maximal depth of tree have sense?

Mario Marengo · August 3, 2005

Yes, I understand that condition except for a sign "-". Why we need it?

I believe they store the area of non-leaf nodes as negative area. It's just a quick way to tell leaf and parent nodes appart, and likely a side effect of using texture maps for the structure.

So, no, you don't need it.

And what maximal depth of tree have sense?

Hard to say, since there are so many ways to go with that -- and most of the choices depend on how you populate the tree, so... I'll say "whatever you think is best". :unsure:

MADjestic · August 4, 2005

Ok, here are my last results.

Currently I haven't implemented the second pass yet, as I've been strugling at getting a decent looking 1st pass + algorithm optimisation with KD-tree like approach.

I haven't played too much with my tree's parameters - so I kept average ones (in terms of speed).

System: Athlon1000, 768 RAM, Windows XP Home edition

Here I culled the back faces of the rabbit:

number of points of the scene: 17925

"kd-tree" generation time: 0.70 sec

cooking time: 3 min 55 sec

rendering time: 6 sec

=============================================

Here is the full rabit:

number of points of the scene: 36669

"kd-tree" generation time: 1.52 sec

cooking time: 14 min 25 sec (around 40 minutes without optimization)

rendering time: 9.02 sec

Later I'll add the second pass and will play some more with "kd-tree" parameters.

Thanks everybody who helped so much (Mario, hoknamahn and others).

Jason · August 4, 2005

Looking good, MADjestic!

Sign In

Fast Gi Anyone?

Recommended Posts

Link to comment

Share on other sites

Top Posters In This Topic

Popular Days

Top Posters In This Topic

Popular Days

Posted Images

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation