Hello, I'm new to the HDK and C++, and this is mostly a learning exercise. I've successfully written a first node that advects data inside a volume by a velocity volume. I then made a version that uses THREADED_METHOD2 to multithread the calculation over the voxels, but inside Houdini that node is slower than the first version. (I've checked: it uses all my processors, while the first version uses only one.) I'd expect multithreading to give some speedup on a volume, so it seems I'm not doing it the right way. Here are the relevant parts of the code.

Single-threaded version:

    for (int idx_z = 0; idx_z < voxel_size[2]; ++idx_z)
    {
        for (int idx_y = 0; idx_y < voxel_size[1]; ++idx_y)
        {
            for (int idx_x = 0; idx_x < voxel_size[0]; ++idx_x)
            {
                UT_Vector3 pos;
                first_volume->indexToPos(idx_x, idx_y, idx_z, pos); // first_volume = GEO_PrimVolume*
                for (int i = 0; i < substeps; i++)
                {
                    float velxv = vvelx->getValue(pos); // vvelx = GEO_PrimVolume*
                    float velyv = vvely->getValue(pos);
                    float velzv = vvelz->getValue(pos);
                    UT_Vector3 advect(velxv, velyv, velzv);
                    pos += advect * amplitude/substeps;
                }
                float val = first_volume->getValue(pos);
                volume_handleW->setValue(idx_x, idx_y, idx_z, val); // volume_handleW = UT_VoxelArrayWriteHandleF
            }
        }
    }

Multithreaded version:

    class VOLADVECT
    {
    public:
        THREADED_METHOD2(   // Construct two parameter threaded method
            VOLADVECT,      // Name of class
            true,           // Evaluated to see if we should multithread.
            advect,         // Name of function
            int, substep,   // An integer parameter named substep
            float, amp)     // A float parameter named amp

        void advectPartial(int substep, float amp, const UT_JobInfo &info);

        int sizex, sizey, sizez;
        UT_VoxelArrayWriteHandleF volume_handleW;
        GEO_PrimVolume* volume_data;
        GEO_PrimVolume* volume_velx;
        GEO_PrimVolume* volume_vely;
        GEO_PrimVolume* volume_velz;
        int myLength;
    };

    void VOLADVECT::advectPartial(int substep, float amp, const UT_JobInfo &info)
    {
        int i, n;
        UT_Vector3 voxel_size(volume_handleW->getRes(0), volume_handleW->getRes(1), volume_handleW->getRes(2));
        sizex = (int)(voxel_size[0]);
        sizey = (int)(voxel_size[1]);
        sizez = (int)(voxel_size[2]);
        myLength = sizex * sizey * sizez;

        for (info.divideWork(myLength, i, n); i < n; i++)
        {
            int idx_x = i % sizex;
            int idx_y = (int)(i / sizex) % sizey;
            int idx_z = (int)(i / (sizex*sizey));
            UT_Vector3 pos;
            volume_data->indexToPos(idx_x, idx_y, idx_z, pos);
            for (int k = 0; k < substep; k++)
            {
                float velxv = volume_velx->getValue(pos);
                float velyv = volume_vely->getValue(pos);
                float velzv = volume_velz->getValue(pos);
                UT_Vector3 advect(velxv, velyv, velzv);
                pos += advect * amp / substep;
            }
            float val = volume_data->getValue(pos);
            volume_handleW->setValue(idx_x, idx_y, idx_z, val);
        }
    }

They both work, but since I don't get better performance from the multithreaded version, I guess I'm not doing it the right way. I know a VEX or OpenCL SOP would be easier and at least as fast, or faster, in this example. If someone knows a better way or has any advice, thanks!
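
For reference, the flat-index arithmetic in advectPartial can be checked on its own, outside Houdini. This is only a minimal sketch, assuming x varies fastest as in the code above; VoxelIndex, unflatten and flatten are hypothetical names for illustration, not part of the HDK. It uses 64-bit arithmetic because the product of the three resolutions can exceed the range of a 32-bit int on large volumes.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; not an HDK class.
struct VoxelIndex
{
    int x, y, z;
};

// Convert a flat voxel number into (x, y, z), with x varying fastest.
// Same arithmetic as in advectPartial, but done in 64 bits.
inline VoxelIndex unflatten(std::int64_t i, int sizex, int sizey)
{
    VoxelIndex v;
    v.x = static_cast<int>(i % sizex);
    v.y = static_cast<int>((i / sizex) % sizey);
    v.z = static_cast<int>(i / (static_cast<std::int64_t>(sizex) * sizey));
    return v;
}

// Inverse mapping: (x, y, z) back to the flat voxel number.
inline std::int64_t flatten(const VoxelIndex &v, int sizex, int sizey)
{
    return v.x + static_cast<std::int64_t>(sizex)
                     * (v.y + static_cast<std::int64_t>(sizey) * v.z);
}
```

With a 4 x 3 x 2 grid, flat index 23 (the last voxel) maps to (3, 2, 1), and flatten maps it back to 23.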