animatrix Posted June 13, 2016 (edited)
This operator allows you to run an OpenCL kernel as part of your SOP network. Depending on the GPU, some operators can be orders of magnitude faster than even VEX. In this case the OpenCL code is 144 times faster than VEX on a GTX 970.
Edited July 11, 2016 by pusat
pezetko Posted June 14, 2016
Nice idea! I tried that today and it works great. Thanks! Time to dust off OpenCL to create some useful kernels.
petz Posted June 14, 2016 (edited)
hey yunus, nice one! still, isn't your example a rather unfair comparison between ocl and vex, since it's not the same basic setup? judging from the picture, it seems that in the vex version you are doing the neighbour lookup inside the loop, while in opencl it is outside the loop. i just did a quick test, and while ocl is indeed much faster (around 15 times), i'm far off getting the same speedup you have. on the other hand, i ran the test on a laptop with a fairly old gpu, and 15 times is still a lot faster!
petz
opencl_smooth.hipnc
Edited June 14, 2016 by petz
pezetko Posted June 14, 2016
I compared that OpenCL smooth against the Smooth SOP on a 1000x1000 grid, on a Xeon E3 1650 v3 vs an Nvidia GTX760, and the speedup is really big. It depends on the size of the data set. The only issue is on systems without a GPU (OpenCL support) or when there is not enough GPU RAM.
petz: If you set the dopnet to Timeless and turn on Solve on Creation Frame on the DOP Object, it will be solved for the current frame, so you don't have to Timeshift to the next frame (but Timeshift is still useful to remove time dependency).
animatrix Posted June 15, 2016 (edited)
16 hours ago, petz said:
"hey yunus, nice one! still, isn't your example a rather unfair comparison between ocl and vex, since it's not the same basic setup? judging from the picture, it seems that in the vex version you are doing the neighbour lookup inside the loop, while in opencl it is outside the loop. i just did a quick test, and while ocl is indeed much faster (around 15 times), i'm far off getting the same speedup you have. on the other hand, i ran the test on a laptop with a fairly old gpu, and 15 times is still a lot faster! petz"
Good point Chris. I did the neighbour lookup outside the loop for OpenCL because there is no other way to my knowledge. I didn't think of doing the same in VEX as an optimization. With that change, OpenCL is ~30x faster than VEX on my PC, as opposed to 124x faster.
On a similar topic, is there a way to make a DOPnet truly time independent? I think it's rather limiting not to be able to use a DOPnet as a time-independent SOP operation. The only way I can think of is to use a TimeShift and push a literal frame value via a callback whenever the current frame changes, but this seems very hacky and not robust to me.
Edited June 15, 2016 by pusat
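For anyone following along, the optimization discussed here (doing the neighbour lookup once, outside the iteration loop, and reusing it for every pass) can be sketched in plain C roughly like this. It is only an illustration, not the code from the HDA or from petz's file: the function names `smooth_pass` and `smooth`, the flat `nbr_start`/`nbr_index` layout, and the use of a single scalar value per point instead of positions are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: one averaging pass over n scalar values.
   The neighbour table is built once by the caller and only read here.
   nbr_start has n + 1 entries; the neighbours of point i are
   nbr_index[nbr_start[i]] .. nbr_index[nbr_start[i + 1] - 1]. */
static void smooth_pass(const float *src, float *dst, size_t n,
                        const size_t *nbr_start, const size_t *nbr_index)
{
    for (size_t i = 0; i < n; ++i) {
        size_t begin = nbr_start[i];
        size_t end = nbr_start[i + 1];
        if (begin == end) {          /* isolated point: keep its value */
            dst[i] = src[i];
            continue;
        }
        float sum = 0.0f;
        for (size_t k = begin; k < end; ++k)
            sum += src[nbr_index[k]];
        dst[i] = sum / (float)(end - begin);
    }
}

/* Hypothetical driver: runs `iterations` passes, ping-ponging between
   buffers a and b, and returns whichever buffer holds the final result.
   No neighbour search happens inside this loop. */
static float *smooth(float *a, float *b, size_t n,
                     const size_t *nbr_start, const size_t *nbr_index,
                     int iterations)
{
    assert(a && b && nbr_start && nbr_index && iterations >= 0);
    for (int it = 0; it < iterations; ++it) {
        smooth_pass(a, b, n, nbr_start, nbr_index);
        float *tmp = a;              /* swap input and output buffers */
        a = b;
        b = tmp;
    }
    return a;
}
```

The point of the layout is that the connectivity does not change between iterations, so the cost of finding neighbours is paid once. A fair VEX vs OpenCL comparison would presumably give both versions the same precomputed table.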
petz Posted June 16, 2016
On 15.6.2016 at 4:00 PM, pusat said:
"On a similar topic, is there a way to make a DOPnet truly time independent? I think it's rather limiting not to be able to use a DOPnet as a time-independent SOP operation."
apart from using timeshift, i don't think there is much you can do, at least not as far as i know...
animatrix Posted June 16, 2016
I understand. But if you use anything frame related in the Timeshift, then it also becomes time dependent. If you hard-code the frame value instead, it can't follow the current frame or the next frame, since it has to be updated whenever the current frame changes. The Timeless option should make the DOPnet truly time independent. Otherwise I don't think it's possible to replicate that behaviour.
petz Posted June 19, 2016
On 16.6.2016 at 10:15 PM, pusat said:
"Timeless option should make the DOPnet truly time independent."
+1
animatrix Posted June 22, 2016 (edited)
You can download the operator here: OpenCL Wrangle SOP
The Relax geometry OpenCL kernel shown in the video is included as a preset inside the HDA. Time to put those Titans to use.
Edited June 22, 2016 by pusat
animatrix Posted February 28, 2017
Time to retire my OpenCL Wrangle SOP and use the new OpenCL Wrangle in H16 instead. You can pretty much copy and paste the code you wrote for mine into the H16 one. SESI did great work on it, especially with many default SOPs using it internally.
luoqiulin Posted June 3, 2017 (edited)
Hi pusat, I'm not a programmer but I'm very interested in OpenCL. Can you enlighten me about this? In houdini\ocl\sim\analysis.cl there is some code I don't fully understand:

    __kernel void curlAligned(__global const float *u,
                              __global const float *v,
                              __global const float *w,
                              __global float *x,
                              __global float *y,
                              __global float *z,
                              const float inv2dx,
                              const float inv2dy,
                              const float inv2dz,
                              uint offset,
                              const uint ystride,
                              const uint zstride)
    {
        const size_t idx = offset + get_global_id(0)
                         + get_global_id(1) * ystride
                         + get_global_id(2) * zstride;
        float dwdy = dudxAligned(w, idx, ystride, inv2dy);
        float dvdz = dudxAligned(v, idx, zstride, inv2dz);
        float dudz = dudxAligned(u, idx, zstride, inv2dz);
        float dwdx = dudxAligned(w, idx, 1, inv2dx);
        float dvdx = dudxAligned(v, idx, 1, inv2dx);
        float dudy = dudxAligned(u, idx, ystride, inv2dy);
        x[idx] = dwdy - dvdz;
        y[idx] = dudz - dwdx;
        z[idx] = dvdx - dudy;
    }

Will you please explain a few things for us?
1. What are *u *v *w *x *y *z, or what's the difference between *u *v *w and *x *y *z?
2. What are inv2dx and ystride, and why is there no need for an xstride?
3. How is idx computed, and what is the offset doing?
Thank you very much!
Edited June 3, 2017 by luoqiulin
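A rough sketch that may help with questions 2 and 3, written as plain C so it can be compiled without a GPU. Reading only the kernel quoted above: u, v, w are declared `const` and are only read, while x, y, z are written, and the three output lines are the components of a curl. Everything else here is an assumption, since the thread does not show `dudxAligned`: the sketch assumes it is a central difference, that `inv2dx` means 1/(2*dx), and that `offset` shifts the index past some leading padding in the buffer. The names `flat_index` and `central_diff` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Flattened index of voxel (i, j, k) in a 1D buffer, mirroring the idx
   expression in the kernel. Neighbours along x sit next to each other in
   memory, so the x stride is simply 1 and needs no parameter. */
static size_t flat_index(size_t offset, size_t i, size_t j, size_t k,
                         size_t ystride, size_t zstride)
{
    return offset + i + j * ystride + k * zstride;
}

/* ASSUMED form of dudxAligned: a central difference along one axis.
   stride selects the axis (1 for x, ystride for y, zstride for z) and
   inv2d is assumed to be 1 / (2 * voxel size) along that axis. */
static float central_diff(const float *f, size_t idx, size_t stride,
                          float inv2d)
{
    assert(f != NULL && idx >= stride);
    return (f[idx + stride] - f[idx - stride]) * inv2d;
}
```

Under those assumptions, stepping by 1 moves one voxel in x, stepping by ystride moves one voxel in y, and stepping by zstride moves one voxel in z, which is why the same helper can be reused for all nine partial derivatives just by changing the stride argument.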