
OpenCL Smoke - any OpenCLified way to source?


danw


I'm trying to optimize some smoke sims to maybe run on OpenCL.

Working from this thread's example scene for how best to create a basic setup:

http://www.sidefx.com/index.php?option=com_forum&Itemid=172&page=viewtopic&t=25234

 - and subsequently checking in the performance monitor, I can see that for a scene where I need to emit density, temperature, and velocity on every frame (rather than just frame 1, as it recommends), the vast majority of sim time is spent sourcing, and a tiny minority running the OpenCL part.

I gather it's not just the time it takes to actually process the source-volume node, but also the time it takes to load the density, temperature and vel.x/y/z fields off the GPU, modify them, and then load them all back on again... does that sound correct?

 

Is there any way whatsoever to "emit" directly on the GPU?  Maybe load static source volumes onto the GPU one time only, and perform the "add" function using an OpenCL-friendly function each frame, or something to that effect?


  • 1 month later...

Following this up - I worked out a couple of creative ways to "emit" directly on the GPU.  These depend entirely on the level of animation you want on an emitter, or hopefully the lack of it.

 

- The general gist is - use a "Gas Match Field" to duplicate the "density" field, call it something like "densityemit".

- Use "Source Volume" as usual, but source it into densityemit rather than density.

- Downstream of the Source Volume node, throw in an "Intermittent Solve" node and set it to "Only Once" if you only want a static emitter that emits in the same place each frame.

   -(If you have an animated emitter that only moves slowly, you can set it to refresh the emission field on the GPU only every 4th frame, or something like that... or if you're running a fast sim with lots of substeps, refresh the source only once per frame, rather than 20 times per frame.)

- Finally, use a "Gas Linear Combination" node with OpenCL ticked on, and set it up to take the Maximum (or Add) of density and densityemit, with density as the target field.

The result is that the GPU repeatedly splats the density emitter each frame, without needing to load new data onto the GPU each time.
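
To make the logic of that node chain a bit more concrete, here is a rough plain-Python model of it. This is not Houdini or OpenCL code: `build_source`, `combine`, `simulate` and the `device` dict are all made-up names that just stand in for the Source Volume node, the Gas Linear Combination node, the sim loop and the fields living on the GPU. The `uploads` counter is only there to show how rarely the emitter would need to be re-sent.

```python
# Plain-Python model of the setup described above. Every name here is
# hypothetical; none of this is Houdini or OpenCL API.

def build_source(num_voxels):
    """Stand-in for the Source Volume node: a static emitter in 2 voxels."""
    return [1.0 if i < 2 else 0.0 for i in range(num_voxels)]


def combine(density, densityemit, mode="max"):
    """Stand-in for Gas Linear Combination: per-voxel Maximum or Add."""
    if mode == "max":
        return [max(d, e) for d, e in zip(density, densityemit)]
    return [d + e for d, e in zip(density, densityemit)]


def simulate(num_frames, num_voxels=4, refresh_every=None, mode="max"):
    """Run the emit step each frame; count how often the source is rebuilt."""
    # "device" models fields that stay resident on the GPU between frames.
    device = {"density": [0.0] * num_voxels, "densityemit": None}
    uploads = 0
    for frame in range(1, num_frames + 1):
        # Intermittent Solve: refresh_every=None behaves like "Only Once",
        # otherwise the emitter is re-sourced every refresh_every frames.
        first_time = device["densityemit"] is None
        periodic = refresh_every is not None and (frame - 1) % refresh_every == 0
        if first_time or periodic:
            device["densityemit"] = build_source(num_voxels)
            uploads += 1
        # This step runs every frame and only touches device-resident data.
        device["density"] = combine(device["density"], device["densityemit"], mode)
    return device["density"], uploads


if __name__ == "__main__":
    print(simulate(8))                   # static emitter
    print(simulate(8, refresh_every=4))  # slowly animated emitter
```

With `refresh_every=None` the emitter is built once and reused on every frame, which is the "Only Once" case. Setting `refresh_every=4` models the slowly moving emitter that only gets refreshed every 4th frame.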

 

The trade-off is that the densityemit field will take up another chunk of GPU memory, so you ultimately reduce the maximum resolution you can fit on the card, for an increase in speed.  This gets especially costly if you want to store velocity emitters on the card.  It pays to be creative and get as much mileage out of one stored density emitter as possible - scaling it up and using it to emit temperature, rather than storing a separate temperatureemit volume, that sort of thing.  Could even potentially work for velocity, but I can't work out how to emit to vector field components individually using only Gas Linear Combination nodes.
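
For a rough feel of that memory cost, here is a back-of-envelope calculation in Python. It assumes one 32-bit float (4 bytes) per voxel per scalar field and three of those for a vector field; that is my assumption rather than anything confirmed above, and it ignores whatever temporary buffers the solver allocates. The function names are hypothetical.

```python
# Back-of-envelope estimate of the extra GPU memory an emitter field costs.
# Assumption: 4 bytes (one 32-bit float) per voxel per scalar component.

BYTES_PER_COMPONENT = 4


def field_bytes(num_voxels, components=1):
    """Bytes for one field; use components=3 for a vector field like vel."""
    return num_voxels * components * BYTES_PER_COMPONENT


def emitter_overhead_gib(num_voxels, scalar_emitters=1, vector_emitters=0):
    """Total extra memory, in GiB, for the stored emitter fields."""
    total = (scalar_emitters * field_bytes(num_voxels)
             + vector_emitters * field_bytes(num_voxels, components=3))
    return total / 2**30


if __name__ == "__main__":
    voxels = 200_000_000  # a 200 megavoxel sim
    print(f"densityemit only:        {emitter_overhead_gib(voxels):.2f} GiB")
    print(f"densityemit + velocity:  {emitter_overhead_gib(voxels, 1, 1):.2f} GiB")
```

Under those assumptions a stored velocity emitter costs three times as much as a scalar one, which is why reusing a single scaled densityemit for temperature is attractive.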

 

I get the feeling there are plenty more tricks to wring the most out of OpenCL sims, speed and memory-usage wise.  These tricks work great even for the Intel OpenCL CPU driver - I've had some 200-300 megavoxel smoke sims running on CPU very fast using this approach - no more than 1-2 minutes a frame.

Edited by danw

I got an 8GB Radeon R9 290X running, and confirmed that it does indeed work with 64-bit addressing.  I had a whole bunch of trouble getting it functioning correctly under Linux Mint though... it seems Radeons are still a pain in the ass under Linux.  Ended up running an older driver that supported OpenCL 1.2 rather than 2.0 - and that at least worked without crashing Houdini, but was also the only driver I found that seemed incapable of setting my older Radeon as the primary display adapter and leaving the R9 as a dedicated OpenCL card.

 

Then after it functioned well a couple of times on that particular farm machine, it started inexplicably failing all OpenCL sims regardless... at which point I was neck-deep in a project, so I gave up and switched it all over to Intel OpenCL driver instead.  I'll dig back into it once I have some free time, as I now have a brand new ~£300 Radeon sitting idle, and I either need to get to the bottom of this, or sell the damned thing :-)

 

 

In summary, it certainly works fine having one card dedicated and another for display, provided your drivers aren't Linuxing all over the place... I suspect the mileage might be better under Windows.

 

Also nVidia recently FINALLY updated their drivers to OpenCL 1.2 as well, and FINALLY support 64-bit addressing too... so those 12GB Titan Xs suddenly look mighty appealing(er) :-)  (Not sure I can really justify a £900 card though!  Where the hell are the 8GB GTX 980s?  They already have 8GB 980Ms in laptops!!)


Guest tar

Oh, Nvidia has a bug in their newest driver: it reports 64-bit addressing but currently can't address more than 4GB. SESI have sent a bug report to Nvidia.


Hah, I wondered... I thought it might allow me to address the last ~500MB on my 4GB card, as the 32-bit addressing seems to top out around 3.5GB for me... but I got identical results.  Well, definitely not time to impulse-buy any Titans yet then :-)


Well, thanks for the heads up, it's great information to know *before* getting bitten in the ass by it!

 

If you hear of any positive developments, let the forum know :-) (I'll do the same if I spot it)

My guess is, that might be in as soon as, ooh, 6-12 months from now! :-P

