animatrix Posted August 22, 2016
Volume convolution on the GPU using OpenCL. For 27M voxels and 100 iterations, OpenCL is 650 times faster than C++ and 12525 times faster than VEX.
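For readers unfamiliar with the technique being benchmarked, a kernel along these lines illustrates one iteration of a simple 6-point box blur over a dense voxel grid. This is a generic sketch, not the author's actual code: the buffer layout, the `resx`/`resy`/`resz` arguments, and the boundary clamping are all assumptions about how the host code binds the volume.

```c
// OpenCL sketch (hypothetical): one iteration of a 6-point box blur
// over a dense float volume stored as a flat buffer.
// resx/resy/resz and the src/dst buffers are assumed to be bound by the host.
kernel void blur(global const float *src,
                 global float *dst,
                 int resx, int resy, int resz)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    int z = get_global_id(2);
    if (x >= resx || y >= resy || z >= resz)
        return;

    // Flatten a 3D index into the linear buffer, clamping at the boundary.
    #define IDX(i, j, k) ((clamp((k), 0, resz - 1) * resy + \
                           clamp((j), 0, resy - 1)) * resx + \
                           clamp((i), 0, resx - 1))

    float sum = src[IDX(x, y, z)]
              + src[IDX(x - 1, y, z)] + src[IDX(x + 1, y, z)]
              + src[IDX(x, y - 1, z)] + src[IDX(x, y + 1, z)]
              + src[IDX(x, y, z - 1)] + src[IDX(x, y, z + 1)];

    dst[IDX(x, y, z)] = sum / 7.0f;
    #undef IDX
}
```

Running many iterations would mean enqueueing this kernel repeatedly while ping-ponging the `src` and `dst` buffers, which keeps all the data on the GPU between passes.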
johner Posted August 22, 2016
Hi Yunus, just a minor point: for your Volume Wrangle approach you can do

sum += volumeindex(0, "density", set(@ix-1, @iy, @iz));

which skips the position calculation and linear interpolation, and is more similar to the OpenCL code (though still way slower!).
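Expanded into a full wrangle, johner's suggestion looks something like the sketch below: a 6-point blur in a Volume Wrangle, assuming the volume is a float volume named "density" on input 0 (the name and the 7-point averaging weights are illustrative, not from the thread).

```c
// VEX Volume Wrangle sketch: 6-point box blur via volumeindex(),
// which reads voxels by integer index (@ix, @iy, @iz) and avoids
// the position lookup and trilinear interpolation of volumesample().
float sum = @density;
sum += volumeindex(0, "density", set(@ix - 1, @iy, @iz));
sum += volumeindex(0, "density", set(@ix + 1, @iy, @iz));
sum += volumeindex(0, "density", set(@ix, @iy - 1, @iz));
sum += volumeindex(0, "density", set(@ix, @iy + 1, @iz));
sum += volumeindex(0, "density", set(@ix, @iy, @iz - 1));
sum += volumeindex(0, "density", set(@ix, @iy, @iz + 1));
@density = sum / 7.0;
```

Because volumeindex() reads from input 0 while @density writes to the output volume, each voxel sees only unmodified neighbour values within a single pass; iterating the blur means chaining or looping the wrangle.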
animatrix Posted August 22, 2016
Thanks John, good idea. I will try to update the code and video. Hopefully this will improve the VEX benchmark a bit.
symek Posted August 22, 2016
Also, "faster than C++" is a bit misleading. You don't actually benchmark C++ code against OpenCL code, but the Volume Blur SOP against your OpenCL implementation of convolution (which is impressive, btw, but still not a proper C++ vs. OpenCL comparison).
animatrix Posted August 22, 2016
I disagree. The Volume Blur SOP is written in C++, so it is nonetheless comparing a C++ implementation of a volume convolution (by SESI) to a possible OpenCL version. From SESI regarding the Volume Convolve SOP: "There pretty much is zero cost for the voxels with a 0 multiplier in them. Because volume convolve 3x3x3 has a known stencil we can stream perfectly, avoiding any random access. VEX will always be slower than it."
Edited August 22, 2016 by pusat
symek Posted August 22, 2016
The title suggests you're comparing C++ to OpenCL to VEX, but in fact you're comparing a SESI C++ node (not code) to your code. It's not apples to apples.
animatrix Posted August 22, 2016
Not really, it's just semantics. You could also wrap the AttribWrangle node in an HDA, black-box it, and compare it just the same. By your terms that too would not be comparing VEX code, but it is.
symek Posted August 22, 2016
Quote: "Not really (...)"
Well, really. Sorry, it's not very important after all, but this is the ABC of benchmarking and comparison studies. It is simply not technically possible for the same algorithm expressed in OpenCL to be 650 times faster than its implementation in C++ on the same hardware. If the algorithm differs, or the hardware differs, or one is multi-threaded and the other is not, you are not entitled to say "OpenCL is X times faster than C++", because it isn't: something else also matters, like the hardware or implementation details. Plain and simple. You would rather say "my OpenCL code is 650 times faster than Houdini's own VolumeBlurSOP". Period. That is still a great result, but it refers to VolumeBlur, not C++.
EDIT: Are you running OpenCL on CPU or GPU?
EDIT2: Oh sorry, I see now, it's a GPU...
animatrix Posted August 22, 2016
Sorry, but I don't have time to argue back and forth with you on this. If my tests do not meet your standards of benchmarking, then simply disregard them and move on. I will continue using them in production and get shots done on time, and not worry about whether I am comparing code or a node, etc.
symek Posted August 22, 2016
Sorry if you feel offended, that wasn't my intention. My remark clearly referred to the expression you used ("it's a bit misleading"), not the essence of your tool or its usefulness. It might be useful in production and I've never doubted that. Peace!
Guest tar Posted August 22, 2016
Very nice! Would be cool to see the OpenCL version run on the CPU too.
animatrix Posted August 22, 2016
Quote (marty): "Very nice! Would be cool to see the OpenCL version run on the CPU too."
I want to give that a try sometime. I hope H16 ships with Intel drivers, with the ability to run OpenCL code on the CPU on a per-node basis.
Guest tar Posted August 22, 2016
Would be super cool if you could be on the H16 beta! Hopefully @johner is showing your tests to the dev team there, saying "we should hire Yunus"!
Edited August 22, 2016 by tar
animatrix Posted August 22, 2016
I would love to help SESI if I can.
kiko Posted October 12, 2016
I agree with symek that it is not a valid comparison, because the algorithm implementations are different.
prashantcgi Posted March 30, 2017
Is there an example file available? Thanks.