
Volume Convolution: VEX & C++ vs OpenCL


animatrix


Hi Yunus,

Just a minor point, for your VolumeWrangle approach you can just do:

sum += volumeindex(0, "density", set(@ix-1, @iy, @iz));

which will skip the position calc and linear interpolation and be more similar to the OpenCL code (though still way slower!)
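Expanding that single tap into a full 3×3×3 box blur in a Volume Wrangle could look something like this (a minimal sketch, assuming the volume is bound as input 0 with a field named "density"; out-of-range taps at the volume boundary are assumed to contribute 0 here):

```vex
// 3x3x3 box blur via direct voxel-index access: no position
// calc, no trilinear interpolation. Each tap is still an
// independent volumeindex() lookup, though.
float sum = 0;
for (int dz = -1; dz <= 1; dz++)
    for (int dy = -1; dy <= 1; dy++)
        for (int dx = -1; dx <= 1; dx++)
            sum += volumeindex(0, "density", set(@ix + dx, @iy + dy, @iz + dz));
@density = sum / 27.0;
```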

 


Also, "faster than C++" is a bit misleading. You don't actually benchmark C++ code against OpenCL code, but the VolumeBlurSOP against your OpenCL implementation of convolution (which is impressive, btw, but it's still not a proper C++ vs. OpenCL comparison :)


I disagree. The Volume Blur SOP is written in C++, so it is nonetheless comparing SESI's C++ implementation of a volume convolution to a possible OpenCL version.

From SESI wrt Volume Convolve SOP:

"There pretty much is zero cost for the voxels with a 0 multiplier in them.  Because volume convolve 3x3x3 has a known stencil we can stream perfectly, avoiding any random access.  VEX will always be slower than it."
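To see why, the same kind of 3×3×3 stencil with arbitrary weights would have to be written in VEX roughly like this (a sketch only; the sharpen-like kernel values are hypothetical). Zero weights can be skipped, but every remaining tap is still a random-access voxel lookup, which is exactly what the SOP's streaming implementation avoids:

```vex
// Hypothetical 3x3x3 kernel, flattened in the same z/y/x order
// as the loops below. Zero-weight taps cost almost nothing, but
// each non-zero tap is an independent volumeindex() call rather
// than a streamed read.
float kernel[] = array(
    0,  0, 0,   0, -1,  0,   0,  0, 0,
    0, -1, 0,  -1,  7, -1,   0, -1, 0,
    0,  0, 0,   0, -1,  0,   0,  0, 0);

float sum = 0;
int k = 0;
for (int dz = -1; dz <= 1; dz++)
    for (int dy = -1; dy <= 1; dy++)
        for (int dx = -1; dx <= 1; dx++)
        {
            float w = kernel[k++];
            if (w != 0)
                sum += w * volumeindex(0, "density", set(@ix + dx, @iy + dy, @iz + dz));
        }
@density = sum;
```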

Edited by pusat

Quote

Not really(...)

Well, really. Sorry, it's not very important after all, but this is the ABC of benchmarking and comparison studies. It's simply not technically possible that the same algorithm expressed in OpenCL is 650 times faster than its implementation in C++ on the same hardware. If the algorithm differs, or the hardware differs, or one is multi-threaded and the other is not, you are not entitled to say "OpenCL is X times faster than C++", because it isn't; something else also matters, like hardware or implementation details. Plain and simple. You would rather say: my OpenCL code is 650 times faster than Houdini's own VolumeBlurSOP. Period. That is still a great result, but it refers to VolumeBlur, not C++.

EDIT: Are you running OpenCL on CPU or GPU?

EDIT2: Oh sorry, I see now, it's a GPU... 


Sorry but I don't have time to argue back and forth with you on this. If my tests do not meet your standards of benchmarking, then simply disregard them and move on.

I will continue using them in production and get shots done on time, and not worry about whether I am comparing code or a node, etc.


Sorry if you feel offended; that wasn't my intention. My remark clearly referred to the expression you used ("it's a bit misleading"), not to the essence of your tool or its usefulness. It might be useful in production, and I've never doubted that. Peace!


12 minutes ago, pusat said:

I wanna give that a try sometime. I hope H16 ships with Intel drivers with the ability to run OpenCL code on the CPU on a per node basis.

Would be super cool if you could be on the H16 beta! Hopefully @johner is showing your tests to the dev team there, saying 'we should hire Yunus'! ;)

Edited by tar
