yourdaftpunk

NAN values and Arnold - a quick fix if this plagues your renders


A problem I've now had twice in production is NAN-valued points in hi-res meshes generated from VDBs. With meshes of maybe 10 million points per frame across 100 frames, I'll get one or two meshes with 200ish bad primitives (prims with garbage points; so far NANs, but I bet infinite values wouldn't play nice either). These meshes look fine and render fine in Houdini, and the bgeos are the same size as those of the other frames. If you finish in Houdini, you'll never know there is a problem. The bad prims are tiny and simply aren't drawn, as they have no position.

 

However, caching out .ass files results in silently truncated frames. There are no errors when writing out these .ass files from the in-house tool. The corrupt frame gets rendered in Maya and either no mesh shows up, or it throws an error and I get an email about issues with my mesh which, again, looks fine in the viewport. The files are about half the size they should be, so truncation is clearly happening.
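
Since the truncated files come out at roughly half their expected size, a size check across the sequence can flag suspect frames before they reach the farm. The following is only a minimal standalone Python sketch: the file names, the sizes, and the 0.75 cutoff ratio are all illustrative assumptions, and in practice the sizes would come from something like os.path.getsize() on the cached files.

```python
from statistics import median

def flag_truncated(sizes, ratio=0.75):
    """Return the names of files whose size is below ratio * median size.

    sizes maps file name -> size in bytes. The 0.75 default is an
    arbitrary assumption, not a value taken from Arnold or Houdini.
    """
    if not sizes:
        return []
    cutoff = ratio * median(sizes.values())
    return sorted(name for name, size in sizes.items() if size < cutoff)

# Hypothetical sizes for four cached frames; frame 3 is about half size.
sizes = {
    "mesh.0001.ass": 1000,
    "mesh.0002.ass": 1020,
    "mesh.0003.ass": 510,
    "mesh.0004.ass": 990,
}
suspects = flag_truncated(sizes)
print(suspects)  # ['mesh.0003.ass']
```

Comparing against the median rather than the mean keeps one or two truncated frames from dragging the reference size down with them.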

 

If this sounds like something you're experiencing, consult the geometry spreadsheet and sort your position values by any channel. If you have not-a-number values, it'll look like this:

 

[Image: geometry spreadsheet screenshot, post-926-0-88361200-1439841436_thumb.png]
 

If you're familiar with VEX, you may know there is a NAN testing function called isnan() which works on a float, so we can test each position channel, and if any channel fails the test we have found our bad geo. A word of advice though: don't just delete the bad point, delete the bad primitive! For whatever reason, Arnold doesn't always find the geo well formed when it has been culled by points in Houdini. You'll often get a vertex mismatch error during rendering, and another email. So the safe solution is to walk over primitives in a wrangle and test the points of the current primitive. If any fail the NAN test, delete the entire prim:

// PrimitiveWrangle code
int cull_prim = 0;
int pt;
vector pos;

for (int i=0; i<primvertexcount(0, @primnum); i++) {

    // convert the prim vertex -> linear vertex -> point number
    pt = vertexpoint(0, vertexindex(0, @primnum, i));

    pos = point(0, "P", pt);
    if (isnan(pos.x) || isnan(pos.y) || isnan(pos.z)) {
        cull_prim = 1;
        break;
    }
}

if (cull_prim)
    removeprim(0, @primnum, 1);
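
For anyone who wants to reason about the same logic outside Houdini, here is a minimal standalone Python sketch of the per-primitive test. It is not Houdini API code: the points and prims lists are hypothetical stand-ins for real geometry, and it uses math.isfinite() so that infinite values are caught along with NANs, which goes one step further than the wrangle above.

```python
import math

def cull_bad_prims(points, prims):
    """Return only the prims whose points all have finite coordinates.

    points is a list of (x, y, z) tuples; prims is a list of tuples of
    point indices. Both layouts are assumptions made for illustration.
    """
    def is_bad(pt):
        # A point is bad if any channel is NAN or +/-infinity.
        return not all(math.isfinite(c) for c in points[pt])

    return [prim for prim in prims if not any(is_bad(pt) for pt in prim)]

nan = float("nan")
inf = float("inf")
points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (nan, 0.0, 0.0),   # NAN point
    (0.0, inf, 0.0),   # infinite point
]
prims = [(0, 1, 2), (1, 2, 3), (0, 2, 4)]

kept = cull_bad_prims(points, prims)
print(kept)  # [(0, 1, 2)]
```

As in the wrangle, the whole primitive is dropped as soon as one of its points is bad, so the surviving prims never reference a garbage point.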

If anyone has a better solution, I'd love to hear it. I'm a bit disappointed by how brittle the Arnold/.ass combination is compared to Houdini's handling of floating point values. Vertex mismatches and NAN errors have wasted too much of my time. On the other hand, Arnold renders look great!

 

Cheers

Shawn


One possible solution could be submitting a bug report, maybe? Sometimes that works :)

After all, nobody likes NANs.


At least you see them while you're still in SOPs! Nothing like unexpected buckets of NANs after nightly renders... on deadline day...


Anim, I do intend to. I see it as three bugs to report:

 

1) Arnold's handling of NANs.

2) Arnold's handling of certain topologies without NANs which seem valid in Houdini (the point delete issue I mentioned).

3) Houdini's/VDB's issues with certain particle to vdb operations and/or sdf smoothing operations.

 

For the third bug, if I can find the time I need to:

 

1) Remove custom otls and slim down the network to the problem area.

2) Transfer the 1.2GB particle cache frame going into the mesh nodes so they can diagnose the issue.

3) Write up some additional observations which I think will help.

 

It's my last week on the job, so I would put all this well below finishing :) I'm also curious what would happen if I wrote that mesh out as an Alembic or OBJ. Would it be loadable in other apps? Would Houdini gracefully bring it back in?

 

 

I hope this post helps some future TD banging her head against a monitor. NANs are part of the floating point specification, along with INF values, and software needs to take this into consideration properly. Much like non-manifold geometry, this stuff will crop up from time to time, or it will come into Houdini through the external pipeline. I remember educating compositors about the issue years ago, when they first started moving to Nuke 6 / exr-half and couldn't understand why some renders had black pixels which couldn't be easily fixed and which grossly contaminated neighboring areas when blurred (protip: Houdini has a built-in COP node called illegalPixel, with a cute icon, for handling this). The solution then was a simple expression much like the VEX code above.
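
The contamination described above follows directly from floating point arithmetic: any sum that includes a NAN is itself NAN, so a blur spreads one bad pixel across the whole kernel width. Below is a minimal standalone Python sketch on a single row of pixels. The 3-tap box blur and the 0.0 replacement value are illustrative assumptions, not how Nuke or the Houdini COP node actually work.

```python
import math

def box_blur(row):
    """3-tap box blur over a row of floats, clamping at the edges."""
    n = len(row)
    return [
        (row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]

def fix_illegal(row, fallback=0.0):
    """Replace NAN and infinite values with a fallback before filtering."""
    return [v if math.isfinite(v) else fallback for v in row]

row = [1.0, 1.0, float("nan"), 1.0, 1.0]

blurred = box_blur(row)
bad = sum(1 for v in blurred if math.isnan(v))
print(bad)  # 3: one NAN pixel has poisoned both of its neighbors

cleaned = box_blur(fix_illegal(row))
print(all(math.isfinite(v) for v in cleaned))  # True
```

The fix has to run before the blur; once the NAN has been averaged into its neighbors there is nothing left to recover.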


Symek, you're right, I'm lucky to be tackling this in SOPs. Have you seen whole buckets poisoned with NANs in Mantra before? I remember mental ray doing that, but so far Mantra has been kind to me. Arnold too.


I noticed that when there are NANs and you render with motion blur on, Mantra will get stuck and never finish.


NANs are never good, and Mantra is not completely bulletproof either. For example, SSS breaks with NANs in normals (it usually looks like tons of super bright speckles all over), and I bet you can find a lot of other cases.

So it's smarter to prevent them or get rid of them, even for Mantra.


Very annoying indeed.

The Clean SOP has this built in if you want a quick fix.

For scenes with complex (even changing) geometry that are rendered with HtoA, it seems to be good practice to add the Clean SOP with "Remove NANs" and "Manifold Topology Only" enabled.

Internally it uses a similar VEXpression:

isnan(@P.x) || isnan(@P.y) || isnan(@P.z)

 


To make this thread easier to find in the future, this is the kind of error message you usually get from Arnold when this happens:

 

* CRASHED in AiIsFinite at 00:00:11, pixel (1952, 664)
* signal caught: error C0000005 -- access violation
*
* backtrace:
*  0 0x00007ff9f47b61ce [ai        ]
*  1 0x00007ff9f47b546f [ai        ]
*  2 0x00007ffa32f2f67a [KERNELBASE] UnhandledExceptionFilter
*  3 0x00007ffa35f44af2 [ntdll     ] memset
03:16:25  9997MB WARNING |   [kick] render aborted due to earlier errors
*  4 0x00007ffa35f2c6d6 [ntdll     ] _C_specific_handler
*  5 0x00007ffa35f411ff [ntdll     ] _chkstk
*  6 0x00007ffa35f0a289 [ntdll     ] RtlRaiseException
*  7 0x00007ffa35f3fe6e [ntdll     ] KiUserExceptionDispatcher
>> 8 0x00007ff9f4da35cb [ai        ] AiIsFinite
*  9 0x00007ff9f4da2e48 [ai        ] AiIsFinite
* 10 0x00007ff9f4da18a2 [ai        ] AiIsFinite
* 11 0x00007ff9f4481bc4 [ai        ] AiUniverseGetSceneBounds
* 12 0x00007ff9f48071bc [ai        ] AiTextureParamsSetDefaults
* 13 0x00007ff9f4770277 [ai        ] AiUniverseGetAOVIterator
* 14 0x00007ff9f476f28d [ai        ] AiUniverseGetAOVIterator
* 15 0x00007ff9f4f83f5d [ai        ] AiIsFinite
* 16 0x00007ff9f4809af0 [ai        ] AiLightsTrace
* 17 0x00007ff9f48113a9 [ai        ] AiTrace
* 18 0x00007ff9f4f982c1 [ai        ] AiIsFinite
* 19 0x00007ff9f4766923 [ai        ] AiUniverseGetAOVIterator
* 20 0x00007ff9f4f6662a [ai        ] AiIsFinite
* 21 0x00007ff9f476f358 [ai        ] AiUniverseGetAOVIterator
* 22 0x00007ff9f4f83f5d [ai        ] AiIsFinite
* 23 0x00007ff9f4809af0 [ai        ] AiLightsTrace
* 24 0x00007ff9f48113a9 [ai        ] AiTrace
* 25 0x00007ff9f4f982c1 [ai        ] AiIsFinite


In some cases though, for some random frames, neither the Clean SOP nor @yourdaftpunk's VEX wrangle helped.

Here I additionally had to subdivide the geo at render time with the Arnold subdivision OBJ-level parms (even though I didn't actually want to subdivide it).

But then it rendered fine. Well yeah, Arnold seems to be very picky about this stuff.
