Hello,

I'm new to the HDK and C++. For now, this is purely a learning exercise.

I've successfully written a first node that advects data inside a volume using a velocity volume. I then made a version that uses THREADED_METHOD2 to multithread the per-voxel calculation, but inside Houdini that node is slower than the first version. (I checked: it uses all my processors, whereas the first version uses only one.) I assume it should be possible to gain performance on a volume, so I'm probably not doing it the right way.

Here are the relevant code samples:

Single-threaded version:

for (int idx_z = 0; idx_z < voxel_size[2]; ++idx_z)
{
	for (int idx_y = 0; idx_y < voxel_size[1]; ++idx_y)
	{
		for (int idx_x = 0; idx_x < voxel_size[0]; ++idx_x)
		{
			UT_Vector3 pos;
			first_volume->indexToPos(idx_x, idx_y, idx_z, pos); // first_volume = GEO_PrimVolume*
				
			for (int i = 0; i < substeps; i++)
			{
				float velxv = vvelx->getValue(pos);   // vvelx = GEO_PrimVolume*
				float velyv = vvely->getValue(pos);
				float velzv = vvelz->getValue(pos);

				UT_Vector3 advect(velxv, velyv, velzv);
				pos += advect * amplitude/substeps;
			}
	
			float val = first_volume->getValue(pos);  

			volume_handleW->setValue(idx_x, idx_y , idx_z,  val ); //  volume_handleW = UT_VoxelArrayWriteHandleF
		}
	}
}
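
Stripped of the HDK calls, the inner substep loop above amounts to the following. This is only a minimal, self-contained sketch: `Vec3`, `VelocityField` and `tracePosition` are hypothetical names introduced for illustration and are not part of the HDK, and the velocity lookup is passed in as a callback instead of three `GEO_PrimVolume::getValue` calls.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for UT_Vector3 (illustration only, not an HDK type).
struct Vec3 { float x, y, z; };

// Hypothetical velocity lookup: returns the velocity sampled at a position.
using VelocityField = std::function<Vec3(const Vec3 &)>;

// Move `pos` through the velocity field in `substeps` equal steps whose
// lengths add up to `amplitude`, resampling the velocity at every step.
Vec3 tracePosition(Vec3 pos, const VelocityField &velocity,
                   float amplitude, int substeps)
{
    assert(substeps > 0);

    // Computed once instead of on every iteration.
    const float step = amplitude / static_cast<float>(substeps);

    for (int i = 0; i < substeps; ++i)
    {
        const Vec3 v = velocity(pos);
        pos.x += v.x * step;
        pos.y += v.y * step;
        pos.z += v.z * step;
    }
    return pos;
}
```

In the node itself, the returned position would then be used to sample the source volume and write the result into the current voxel, as the loop above does.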

Multithreaded version:

class VOLADVECT
{
public:
	THREADED_METHOD2(        // Construct two parameter threaded method
		VOLADVECT,           // Name of class
		true,                // Evaluated to see if we should multithread.
		advect,              // Name of function
		int, substep,        // An integer parameter named substep
		float, amp)          // A float parameter named amp
	void advectPartial(int substep, float amp, const UT_JobInfo &info);

	int sizex, sizey, sizez;
	UT_VoxelArrayWriteHandleF volume_handleW;
	GEO_PrimVolume* volume_data ;
	GEO_PrimVolume* volume_velx ;
	GEO_PrimVolume* volume_vely ;
	GEO_PrimVolume* volume_velz ;
	int myLength;
};

void VOLADVECT::advectPartial(int substep, float amp, const UT_JobInfo &info)
{
	// Keep per-call values in locals: this function runs on every thread at
	// the same time, so writing them to shared members is a data race.
	const int resx = volume_handleW->getRes(0);
	const int resy = volume_handleW->getRes(1);
	const int resz = volume_handleW->getRes(2);
	const int length = resx * resy * resz;

	// Distance factor per substep, computed once outside the voxel loop.
	const float step = amp / substep;

	int i, n;
	for (info.divideWork(length, i, n); i < n; i++)
	{
		// Recover the 3D voxel index from the linear index (x varies fastest).
		const int idx_x = i % resx;
		const int idx_y = (i / resx) % resy;
		const int idx_z = i / (resx * resy);

		UT_Vector3 pos;
		volume_data->indexToPos(idx_x, idx_y, idx_z, pos);

		// Move the sample position through the velocity field in substeps.
		for (int k = 0; k < substep; k++)
		{
			float velxv = volume_velx->getValue(pos);
			float velyv = volume_vely->getValue(pos);
			float velzv = volume_velz->getValue(pos);

			UT_Vector3 advect(velxv, velyv, velzv);
			pos += advect * step;
		}

		// Sample the source volume at the traced position and store it.
		float val = volume_data->getValue(pos);

		volume_handleW->setValue(idx_x, idx_y, idx_z, val);
	}
}
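
The index arithmetic used in `advectPartial` can be checked on its own, outside Houdini. The sketch below is plain C++ with no HDK dependency; `VoxelIndex`, `unflatten`, `flatten` and `jobRange` are hypothetical helpers written for illustration. In particular, `jobRange` is only an assumed even split of a linear range into contiguous chunks and is not claimed to match what `UT_JobInfo::divideWork` does internally. It uses 64-bit linear indices because `sizex * sizey * sizez` can exceed the range of `int` on very large volumes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 3D voxel index (illustration only, not an HDK type).
struct VoxelIndex { int x, y, z; };

// Linear index -> (x, y, z), with x varying fastest, as in advectPartial.
VoxelIndex unflatten(std::int64_t i, int sizex, int sizey)
{
    assert(sizex > 0 && sizey > 0);
    const std::int64_t slab = static_cast<std::int64_t>(sizex) * sizey;
    return { static_cast<int>(i % sizex),
             static_cast<int>((i / sizex) % sizey),
             static_cast<int>(i / slab) };
}

// (x, y, z) -> linear index; the inverse of unflatten.
std::int64_t flatten(VoxelIndex v, int sizex, int sizey)
{
    return v.x + static_cast<std::int64_t>(sizex) *
                     (v.y + static_cast<std::int64_t>(sizey) * v.z);
}

// Half-open range [begin, end) of linear indices handled by one job.
struct Range { std::int64_t begin, end; };

// Assumed even split of `length` items across `numJobs` jobs.
Range jobRange(std::int64_t length, int job, int numJobs)
{
    assert(numJobs > 0 && job >= 0 && job < numJobs);
    return { length * job / numJobs, length * (job + 1) / numJobs };
}
```

With this kind of split, each job receives one contiguous run of linear indices, and consecutive jobs meet exactly at their boundaries, so every voxel is visited exactly once.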

Both versions work, but since I'm still learning, I suspect I'm doing something wrong, because the multithreaded version does not perform any better. I know a VEX or OpenCL SOP would be easier and at least as fast, if not faster, for this example.

If someone knows a better way or has any advice, thanks!

Edited by teresuac
