
HDK Multithreading volumes



Hello,

I'm new to the HDK and C++. At the moment this is just for learning purposes.

I've successfully made a first node that advects data inside a volume by a velocity volume. I then made a version that uses THREADED_METHOD2 to multithread the per-voxel calculation, but inside Houdini that node is slower than the first version. (I checked: it uses all my processors, while the first version uses only one.) I assume multithreading should speed up work on a volume, so it seems I'm not doing it the right way.

Here are some samples of the code :

single threaded version :

for (int idx_z = 0; idx_z < voxel_size[2]; ++idx_z)
{
	for (int idx_y = 0; idx_y < voxel_size[1]; ++idx_y)
	{
		for (int idx_x = 0; idx_x < voxel_size[0]; ++idx_x)
		{
			UT_Vector3 pos;
			first_volume->indexToPos(idx_x, idx_y, idx_z, pos); // first_volume = GEO_PrimVolume*
				
			for (int i = 0; i < substeps; i++)
			{
				float velxv = vvelx->getValue(pos);   // vvelx = GEO_PrimVolume*
				float velyv = vvely->getValue(pos);
				float velzv = vvelz->getValue(pos);

				UT_Vector3 advect(velxv, velyv, velzv);
				pos += advect * amplitude/substeps;
			}
	
			float val = first_volume->getValue(pos);  

			volume_handleW->setValue(idx_x, idx_y , idx_z,  val ); //  volume_handleW = UT_VoxelArrayWriteHandleF
		}
	}
}
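
For reference, the substepped trace in the inner loop can be sketched in plain C++ without the HDK. This is only an illustration under stated assumptions: `Vec3` and `tracePosition` are hypothetical names standing in for `UT_Vector3` and the loop body, and the velocity lookup is passed in as a callback instead of sampling a `GEO_PrimVolume`.

```cpp
#include <array>
#include <cassert>
#include <functional>

// Hypothetical stand-in for UT_Vector3; not an HDK type.
using Vec3 = std::array<float, 3>;

// Move `pos` through the velocity field in `substeps` equal steps.
// `velocity` is an assumed callback that replaces the three getValue() calls.
Vec3 tracePosition(Vec3 pos,
                   const std::function<Vec3(const Vec3 &)> &velocity,
                   float amplitude, int substeps)
{
    assert(substeps > 0);
    // The step length is constant, so compute it once outside the loop.
    const float step = amplitude / static_cast<float>(substeps);
    for (int i = 0; i < substeps; ++i)
    {
        const Vec3 v = velocity(pos);   // sample velocity at the current position
        for (int c = 0; c < 3; ++c)
            pos[c] += v[c] * step;
    }
    return pos;
}
```

With a constant velocity field the result does not depend on the number of substeps, which makes a convenient sanity check for the loop.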

multithreaded version : 

class VOLADVECT
{
public:
	THREADED_METHOD2(            // Construct two parameter threaded method
		VOLADVECT,               // Name of class
		true,                    // Evaluated to see if we should multithread.
		advect,                  // Name of function
		int, substep,            // An integer parameter named substep
		float, amp)              // A float parameter named amp
	void advectPartial(int substep, float amp, const UT_JobInfo &info);

	int sizex, sizey, sizez;
	UT_VoxelArrayWriteHandleF volume_handleW;
	GEO_PrimVolume* volume_data ;
	GEO_PrimVolume* volume_velx ;
	GEO_PrimVolume* volume_vely ;
	GEO_PrimVolume* volume_velz ;
	int myLength;
};

void VOLADVECT::advectPartial(int substep, float amp, const UT_JobInfo &info)
{
	int  i, n;

	// Keep the resolution in locals: every thread runs this function, so
	// writing the shared members here would be a data race.
	const int resx = volume_handleW->getRes(0);
	const int resy = volume_handleW->getRes(1);
	const int resz = volume_handleW->getRes(2);
	const int length = resx * resy * resz;

	for (info.divideWork(length, i, n); i < n; i++)
	{
		// Recover the voxel coordinates from the linear index
		int idx_x = i % resx;
		int idx_y = (i / resx) % resy;
		int idx_z = i / (resx * resy);

		UT_Vector3 pos;
		volume_data->indexToPos(idx_x, idx_y, idx_z, pos);

		for (int k = 0; k < substep; k++)
		{
			float velxv = volume_velx->getValue(pos);
			float velyv = volume_vely->getValue(pos);
			float velzv = volume_velz->getValue(pos);

			UT_Vector3 advect(velxv, velyv, velzv);
			pos += advect * amp / substep;
		}

		float val = volume_data->getValue(pos);

		volume_handleW->setValue(idx_x, idx_y, idx_z, val);
	}
}
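
To show the splitting pattern in isolation, here is a minimal sketch in standard C++ only, with no HDK calls. It assumes a dense array in place of the voxel array, and `Grid`, `fillPartial` and `fillThreaded` are hypothetical names. Each thread receives a disjoint slice of the linear index range, reads the resolution into locals, and writes only its own elements. It does not model how Houdini stores voxels internally, so it says nothing about why the HDK version is slow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical dense volume; not an HDK type.
struct Grid
{
    int sx, sy, sz;
    std::vector<float> data;
    Grid(int x, int y, int z)
        : sx(x), sy(y), sz(z), data(static_cast<std::size_t>(x) * y * z, 0.0f) {}
};

// Process the voxels whose linear index lies in [begin, end).
void fillPartial(Grid &g, long begin, long end)
{
    // Read-only copies of the resolution, private to this thread.
    const int sx = g.sx, sy = g.sy;
    for (long i = begin; i < end; ++i)
    {
        const int x = static_cast<int>(i % sx);
        const int y = static_cast<int>((i / sx) % sy);
        const int z = static_cast<int>(i / (static_cast<long>(sx) * sy));
        // Placeholder computation: encode the coordinates in the value.
        g.data[static_cast<std::size_t>(i)] = static_cast<float>(x + 10 * y + 100 * z);
    }
}

// Split the linear range into contiguous chunks, one per thread.
void fillThreaded(Grid &g, int numThreads)
{
    assert(numThreads > 0);
    const long total = static_cast<long>(g.data.size());
    const long chunk = (total + numThreads - 1) / numThreads;
    std::vector<std::thread> pool;
    for (int t = 0; t < numThreads; ++t)
    {
        const long begin = std::min(total, t * chunk);
        const long end = std::min(total, begin + chunk);
        if (begin < end)
            pool.emplace_back(fillPartial, std::ref(g), begin, end);
    }
    for (auto &th : pool)
        th.join();
}
```

Calling `fillThreaded(grid, 4)` fills every voxel exactly once, because the chunks neither overlap nor leave gaps.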

They both work, but since I'm still learning, I guess I'm not doing it the right way, as the multithreaded version doesn't perform better. I know a VEX or OpenCL SOP would be easier and at least as fast, if not faster, in this example.

If someone knows a better way or has some advice, thanks!

Edited by teresuac