How to get the point number with the lowest float attribute value?


Guest mantragora

It's lame to use the HDK where VEX can handle it!

... very true.

So it's good that I used HDK.

Because when I bumped the number of points to 600000 and compared the VEX version with the spreadsheet and my HDK version, only sorting in the spreadsheet and the HDK version found the correct points/values.

VEX couldn't handle it :D.

Guys, could you test your solutions with a higher number of points? I'm also getting wrong values from the other solutions here. I also made a mixed VEX/Python version, and it failed at higher point counts too.

DOWNLOAD


Guest mantragora

Symek, it looks like the VEX solution starts to fail once you cross roughly 100016 points. 100015 is still correct, anything above is not. I don't know why. The others are failing too. Maybe it's just my computer, but the HDK version and spreadsheet sorting show the same numbers.

EDIT: Hm, after a couple thousand more points it starts to work again, but at 300000 it definitely gives both points incorrectly.


Guest mantragora

On my dual-core, 4-year-old computer (and it wasn't a top-of-the-line processor even then), with 1M points:

46.930 s for my python version

135 ms for HDK version.

Both return min/max values at once. The HDK version also creates groups. You could probably speed up the HDK version quite a bit with Attribute Reference Maps, but I still haven't figured out how to use them correctly. All I got was a crash :).

Have you compared the values from all versions? With 1M points Symek's VEX method goes crazy on my computer. The only change I made to his code was using addgroup() to create two groups, "low" and "high", instead of addattribute() to store the results.


No, I only checked Symek's method, as I was working on his hip file and the other methods generate different weight values for each point.

The python version is very slow but 135ms is very good :)

My VEX method could also be sped up using some manual "threading". I modified the code and dropped the total time to 88ms. Basically, instead of using 1 point for the first input, I used 10. It could be made to use the number of cores you have, which I think would be optimal.

You still need to do one final pass at the end which is negligible given the remaining number of points. Also my machine is faster so your HDK version is most likely still faster :)

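// Runs once per point of the first input; each of those points
// scans its own chunk of the points being searched.
int minptnum = 0;
float minw = 1e9;
float w = 0;

// Chunk size: 10 points in the first input x 100000 covers 1M points.
int step = 100000;
int start = ptnum * step;
int end = (ptnum + 1) * step;

for(int i=start;i<end;i++)
{
    // Read "weight" from point i of input 1.
    import( "weight", w, 1, i );
    if ( w < minw )
    {
        minptnum = i;
        minw = w;
    }
}
// Store this chunk's winner; the final pass compares the winners.
addattribute("minptnum", minptnum);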
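For readers who want the whole two-pass idea in one place, here is a rough sketch in plain Python rather than VEX. The names `chunk_min` and `two_pass_argmin` are made up for illustration, and the list `values` stands in for the "weight" attribute. Unlike the fixed step above, it derives the chunk size from the point count and clamps the last chunk, which is an assumption about how you might generalize it, not something from the original hip file.

```python
def chunk_min(values, start, end):
    """Return (index, value) of the smallest value in values[start:end]."""
    best = start
    for i in range(start + 1, end):
        if values[i] < values[best]:
            best = i
    return best, values[best]


def two_pass_argmin(values, num_chunks):
    """Find the index of the minimum via per-chunk minima plus a final pass."""
    n = len(values)
    if n == 0:
        raise ValueError("values is empty")
    # Ceiling division so the chunks cover every point.
    step = -(-n // num_chunks)
    # Pass 1: one candidate per chunk (one per first-input point in the VEX setup).
    candidates = [
        chunk_min(values, start, min(start + step, n))
        for start in range(0, n, step)
    ]
    # Pass 2: the final reduction over the few remaining candidates.
    # min() keeps the first minimal element, so ties resolve to the lowest index.
    return min(candidates, key=lambda c: c[1])[0]
```

In Houdini the first pass is what actually runs in parallel; the Python here only shows the bookkeeping.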


This thread seems pretty obsessed with speed and point counts. Sure, faster with more points is always great, but is it really needed all the time? If a system takes 0.5 seconds to calculate 1 million points, but you're only ever going to have 10000 points, is a different system that also takes 0.5 seconds but for only 10000 points not acceptable? What if I told you I had a system that could handle 200 million points in less than 0.5 seconds? Or 4 lines of Python that can do the 1 million point test in 0.1 seconds? Does it really matter that the system can handle a ridiculous amount of data that you'll never need? In the interest of overall time I'd just stick with the simple method: sort the points by the value, keep an extra attribute holding the original point number, and then grab that attribute from the first point after the sort. Quick and easy.
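As a rough illustration of that sort-based approach outside of SOPs, here is a plain Python sketch. The function name `min_max_by_sort` is hypothetical, and the list `values` stands in for the attribute values in point order. Pairing each value with its index plays the role of the extra "original point number" attribute.

```python
def min_max_by_sort(values):
    """Return (point number of the lowest value, point number of the highest)."""
    if not values:
        raise ValueError("values is empty")
    # Tag every value with its original point number before sorting.
    tagged = sorted((value, ptnum) for ptnum, value in enumerate(values))
    # After the sort, the first entry holds the minimum and the last the maximum.
    return tagged[0][1], tagged[-1][1]
```

Sorting is more work than a single scan, but as argued above that rarely matters at modest point counts.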

Also, for something like this, I wouldn't even do it from within a SOP. This is a simple query for information, so why not just build a system that doesn't need to be involved with a SOP? For example, as stated above, here's the simple Python that returns the min and max value point numbers. All it takes is a hou.Geometry object that can be from a cooking Python operator, or any other SOP.

vals = geometry.pointFloatAttribValues("myattribute")
min_val = min(vals)
max_val = max(vals)
result = (vals.index(min_val), vals.index(max_val))
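If you want to reuse those lines, one way to wrap them is sketched below. The function name `min_max_point_numbers` and the empty-geometry check are my additions for illustration. The only thing assumed of `geometry` is the `pointFloatAttribValues()` method used above, which returns the values in point order, so a position in the result is a point number.

```python
def min_max_point_numbers(geometry, attrib_name):
    """Return (min point number, max point number) for a float point attribute.

    `geometry` is expected to be a hou.Geometry; only its
    pointFloatAttribValues() method is used here.
    """
    vals = geometry.pointFloatAttribValues(attrib_name)
    if not vals:
        # No points, so there is no min or max to report.
        return None
    # index() returns the first match, so ties resolve to the lowest point number.
    return (vals.index(min(vals)), vals.index(max(vals)))
```

From a Python SOP you might call it as `min_max_point_numbers(hou.pwd().geometry(), "myattribute")`.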

The version that did 200 million points (4ms for 1 million points) was written simply using inlinecpp and basic iteration.

It's all overkill!


You could probably speed up the HDK version quite a bit with Attribute Reference Maps, but I still haven't figured out how to use them correctly. All I got was a crash :).

I took a look at the files you attached and noticed you are basically still using the pre-H12/GA method of doing things. Using GEO_Point in H12 is slower than it was in H11 since it's now almost like a fake thing. Using the proper GA methods of accessing attributes and stuff would make your operator much faster.

Also, I don't see how AttributeRefMaps would be useful in this situation. They offer the ability to quickly copy attribute values around between different entities and details, which isn't really useful here.


What if I told you I had a system that could handle 200 million points in less than 0.5 seconds?

lol I would say post that inline code please :)

I am not obsessed with speed. I am using my VEX method because it's simple and fast enough for me.

For my problem I also have to handle groups, which is another reason VEX works well in this case.


Guest mantragora
I took a look at the files you attached...

Too many methods, not enough learning sources. Or not enough sufficiently clear ones. Most of the time here, if you ask anything about the HDK, you end up talking to yourself for a couple of posts.


Guest mantragora

So I changed the code a little, just the loop. I hope this is what Graham had in mind when he said I was using pre-H12 GA methods. Calculation times fell to 40ms with 1M points, 330ms with 8M points, and 2.1s with 50M points. You can compare it with the one that I posted above for download.

GA_RWAttributeRef refh = gdp->findFloatTuple(GA_ATTRIB_POINT, attribute_NAME, 1);
GA_RWHandleF hand(refh.getAttribute());

if(hand.isValid())
{
    for(GA_Iterator it(gdp->getPointRange()); !it.atEnd(); it.advance())
    {
        // get point number
        GA_Index ptnum = it.getIndex();
        // get attribute value
        fpreal att = (hand.get(it.getOffset()));

        if (att > _maxWeight)
        {
            _maxWeight = att;
            _maxPoint = ptnum;
        }
        if (att < _minWeight)
        {
            _minWeight = att;
            _minPoint = ptnum;
        }
        if (ptnum == gdp->points().entries() - 1)
        {
            _lowestValueGroup->addIndex(gdp->points().entry(_minPoint)->getNum());
            _highestValueGroup->addIndex(gdp->points().entry(_maxPoint)->getNum());
        }
    }
}

@Magneto: You can check the attached scene. It comes with a *.dll compiled for 12.0.693. Most of the time I change builds once every two weeks, and I pick up the last one released before the weekend.

BTW, it's for Windows.

DOWNLOAD


So I changed the code a little, just loop...

That's definitely on the right track as your results indicate. Some additional improvements you could make:

  • You could use GA_Detail::findPointAttribute() and just pass along the name. (Not really a speed improvement, just a bit cleaner I guess)
  • You can dereference the iterator (*it) to get the offset, as opposed to it.getOffset().
  • You're still using the old GEO_Point method when you try to find the point number to add it to the group. You can think of a GA_Index as being the point number in this case. So instead of using the index to index into the list of GEO_Point objects and then calling getNum() to get the point number, you could just pass your _minPoint/_maxPoint value to the GA_ElementGroup::addIndex() call. Even better would be to avoid indices altogether and use offsets (_lowestValueGroup->addOffset(*it)). Using indices results in extra lookups, because Houdini has to map the index back to the offset inside the function.
  • Also, rather than testing your point number as being the last one in the list, why not just add the elements to the group after the for loop is done entirely?

Here is the code for my inlinecpp function. It's very similar to your new code plus some of the suggestions above. It's slightly different in that I'm not adding to a group but merely returning the point numbers and I also use page iteration, though it's not really necessary in this case.

IntArray getPointAttribMinMaxIndex(const GU_Detail *gdp, const char *attribute_name)
{
    float                       value, min_value, max_value;
    bool                        init = true;

    std::vector<int>            point_nums;

    GA_Offset                   pt, start, end, min_offset, max_offset;

    GA_ROAttributeRef           attr_gah;
    GA_ROPageHandleF            attr_ph;

    // Find the point attribute.
    attr_gah = gdp->findPointAttribute(attribute_name);

    // Attach a page handle.
    attr_ph.bind(attr_gah.getAttribute());

    // Iterate over the point range.
    for (GA_Iterator it(gdp->getPointRange()); it.blockAdvance(start, end); )
    {
        // Set the page start.
        attr_ph.setPage(start);

        // Iterate over the offsets in the page.
        for (pt = start; pt < end; ++pt)
        {
            // Get the value for the offset.
            value = attr_ph.get(pt);

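            // The first iteration through we need to set the min and
            // max values, and their offsets, to the first element.
            // Without the offsets, min_offset/max_offset stay
            // uninitialized when the first point is the min or max.
            if (init)
            {
                min_value = value;
                max_value = value;
                min_offset = pt;
                max_offset = pt;
                init = false;
                continue;
            }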

            // Update the min value.
            if (value < min_value)
            {
                min_value = value;
                min_offset = pt;
            }

            // Update the max value.
            else if (value > max_value)
            {
                max_value = value;
                max_offset = pt;
            }
        }
    }

    // Add the min and max point indices to the array.
    point_nums.push_back(gdp->pointIndex(min_offset));
    point_nums.push_back(gdp->pointIndex(max_offset));

    return point_nums;
}


Guest mantragora

That's definitely on the right track as your results indicate. Some additional improvements you could make:

Yeah. So, with some tweaks here and there, computation times dropped again.

When using GA_Index to add the points to the group:

  • 26ms for 1M points

When using "offset = *it" to add the points to the group:

  • 20ms for 1M points
  • 1.2s for 50M points

EDIT: If you add a check for user interrupts, computation time jumps up to 70ms.

Thank you for the tips, Graham.
