
Python Performance Issues


Adam Ferestad


So I am working on a project that requires me to extract the point numbers from an arbitrary polygonal object. The most Pythonic way to do what I am trying to do is this:
 

import hou

geo = hou.Geometry()
geo.loadFromFile("path/to/my/file.bgeo.sc")
# One list of point numbers per primitive, in winding order
primPoints = [[pt.number() for pt in pr.points()] for pr in geo.prims()]

I need to maintain the winding order and get it down to just the ints for the point numbers. I have profiled every function in this, and hou.Prim.points() is god-awful slow, taking up 16 of 22 seconds on a 1.5M poly, 800k point file. I am trying to find a way to circumvent hou.Prim.points(), but nothing is jumping out at me. I have even moved everything into Pandas, which does make working with the data and doing further transformations easier, but I have not gotten any performance increase, as it still relies on calling hou.Prim.points() for every primitive.
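For anyone wanting to reproduce that kind of per-function breakdown, here is a minimal sketch using the standard library's cProfile. The helper name profile_to_text is made up for illustration, and the function being profiled is whatever extraction function you wrap around the list comprehension above.

```python
import cProfile
import io
import pstats


def profile_to_text(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report text).

    The report lists the `top` entries sorted by cumulative time, which is
    where an expensive call such as hou.Prim.points() would show up.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()
```

Usage would be something like result, report = profile_to_text(extract, geo) followed by print(report), where extract is your own function.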

I have contemplated using hou.Geometry.points() and calling hou.Point.prims() to go at it from the other direction, thinking it might save some time since the list of points is shorter than the list of prims in my files, but that loses the winding order, which I need for my use case. Said use case is parsing cached geometry into a custom format for optimized use elsewhere. I have been able to get the performance up to snuff on just about everything else, and will save even more time on later steps thanks to Pandas data management, but this single step is so sluggish that it just about tanks the whole project. Everything else in the pipeline I am developing is under 4-5 seconds/frame for huge geometry volumes in the pre-process step (which this is part of), and the final end of the process is so fast I'm not even going to talk about it. Just this one step is bogging things down in an untenable way.
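To illustrate why going through the points loses the winding order, here is a small sketch in plain Python. The two-triangle mesh is hypothetical and no Houdini is involved; the point_prims dictionary just stands in for the kind of lookup hou.Point.prims() would give.

```python
from collections import defaultdict

# Hypothetical two-triangle mesh: each prim lists its point numbers in winding order.
prim_points = [[0, 1, 2], [2, 1, 3]]

# Point -> prims lookup, standing in for what hou.Point.prims() provides.
point_prims = defaultdict(list)
for prim_num, pts in enumerate(prim_points):
    for pt in pts:
        point_prims[pt].append(prim_num)

# Rebuilding prim -> points from that lookup visits points in point-number
# order, so each prim comes back sorted instead of in its original winding.
rebuilt = defaultdict(list)
for pt in sorted(point_prims):
    for prim_num in point_prims[pt]:
        rebuilt[prim_num].append(pt)

rebuilt_points = [rebuilt[i] for i in range(len(prim_points))]
```

The second triangle started as [2, 1, 3], but the rebuilt version can only tell you which points it uses, not the order they were wound in.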

 

Pandas code for reference:

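import hou
import pandas as pd

geo = hou.Geometry()
geo.loadFromFile("path/to/my/file.bgeo.sc")

# Step 1: one hou.Prim per row
primsSeries = pd.Series(data=geo.prims())
# Step 2: replace each prim with its tuple of hou.Point objects (the slow call)
primsSeries = primsSeries.apply(hou.Prim.points)
# Step 3: reduce each tuple of points to a list of point numbers
primsSeries = primsSeries.apply(lambda x: list(map(hou.Point.number, x)))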

 

The two take virtually identical amounts of time, with the pure Python one being a couple of seconds faster because it calls hou.Point.number in the same step as the list generation, whereas the Pandas code currently does it as a secondary step. I'm sure I could include it in the hou.Prim.points apply, but this was mostly separated for profiling to see where all of my speed was going.
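The merged version mentioned above might look like the sketch below: one helper that does both steps per primitive. The helper names are made up for illustration. This still calls points() once per primitive, so it would not be expected to fix the bottleneck; it only removes the second pass.

```python
def prim_point_numbers(prim):
    """Return the point numbers of one primitive, in the order points() yields them."""
    return [pt.number() for pt in prim.points()]


def all_prim_point_numbers(prims):
    """Apply prim_point_numbers to every primitive in a single pass."""
    return [prim_point_numbers(pr) for pr in prims]
```

With Pandas, the same helper could be passed to a single apply, for example pd.Series(geo.prims()).apply(prim_point_numbers).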

 

Does anyone out there have any idea how to bypass hou.Prim.points in this process in favor of something faster or am I 100% stuck?


Is loading the geometry directly from Python a requirement?

If not, then here's what I did.
I used a File node to load the geo, then with a Wrangle I pre-processed the pts[] array on each prim.

i[]@pts = primpoints(0, @primnum); // In a Wrangle set to run over Primitives, with the File as an input

Then, instead of "[pt.number() for pt in pr.points()]", I do "pr.intListAttribValue("pts")"

node = hou.pwd()
geo = node.geometry() # Need to load the input geo instead of directly from the disk

primPoints = [pr.intListAttribValue("pts") for pr in geo.prims()]

I did some performance testing
The original takes 21s. If we load with the File node (but keep the same primPoints line), it's 19s. With the method above, the time goes down to 4s.

python_performance_example.hipnc
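For comparing variants like the three timings above, a small wall-clock helper is usually enough. This is only a sketch; best_time is a made-up name, and taking the best of a few runs is just one reasonable way to reduce noise.

```python
import time


def best_time(func, *args, repeats=3):
    """Return (result, best wall-clock seconds over `repeats` calls of func(*args))."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    return result, best
```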


 

This is an interesting result, and I will have to examine things. It is imperative that it is implemented within Hython. The final functionality is going to call Hython from outside Houdini and use it to manage data, so the code has to live entirely within a .py library set. I am wondering if I could implement this using the hou.runVex() function. Unfortunately I have done literally zero VEX compiling and would need to figure that out. If I could do that and see these sorts of improvements in my code, I would be attempting backflips.
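One way the wrangle approach might be kept inside a pure .py library is to build the File and Wrangle nodes from Hython itself. The sketch below assumes that creating temporary nodes in the Hython session is acceptable, and the function name is made up for illustration. It has not been timed against the numbers above, so treat it as something to try, not as a confirmed fix.

```python
def load_prim_points_via_wrangle(path):
    """Load `path` with a File SOP, store primpoints() in a 'pts' prim attribute
    with an Attribute Wrangle, and return one list of point numbers per prim."""
    import hou  # only importable inside Houdini / Hython

    # Temporary container for the two SOPs (assumption: scratch nodes are OK).
    container = hou.node("/obj").createNode("geo")
    try:
        file_node = container.createNode("file")
        file_node.parm("file").set(path)

        wrangle = container.createNode("attribwrangle")
        wrangle.setInput(0, file_node)
        wrangle.parm("class").set("primitive")  # run over primitives
        wrangle.parm("snippet").set("i[]@pts = primpoints(0, @primnum);")

        geo = wrangle.geometry()  # cooks the wrangle
        # Copy the values out before the temporary nodes are destroyed.
        return [list(pr.intListAttribValue("pts")) for pr in geo.prims()]
    finally:
        container.destroy()
```

The values are copied into plain lists inside the try block because the geometry returned by the node should not be relied on once the node is gone.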


@Alain2131 I am attempting to get a VFL file written which will allow me to utilize this method. I knew how fast and easy it would be in VEX, but unfortunately there is no good go-between for the languages, and the Hython restriction is a rock-solid one. Eventually we may build a C++ DLL for it from the HDK, but for the time being we are using our app to interface directly with Hython, so it has to be in Hython or accessible from Hython.

As far as I can tell, there are no verbs available which can utilize VEX, so the only option I see is hou.runVex() and compiling my own VEX file to encapsulate the primpoints() VEX function. I have attempted, and failed. I cannot figure out how to type the parameter so that I can pass in the HOM geometry object, which doesn't surprise me, as it is a Python object. I could possibly pass the file name, but loading and unloading files from memory to extract int lists on single prims seems woefully inefficient.


Well, bummer.

I don't know much about compiled .vex libraries, but I was able to get a very simple example working.

Here are the resources I found:
https://forums.odforce.net/topic/26970-creating-vex-libraries/?do=findComment&comment=155341
https://www.sidefx.com/docs/houdini/vex/vcc
https://www.sidefx.com/docs/houdini/hom/hou/runVex.html#example

Basically, you'll need to place the generated .vex file in this path "%userprofile%\Documents\houdini19.0\vex\include" (you'll need to create the folders "vex" and "include")
(Although, now that I think about it, this is only really needed for .vfl to be included in VEX. For python, that's not relevant, and can be anywhere you like)

I created a code.vfl file with this content. (the path of the .vfl does not matter, but I put it in the same vex/include folder)

cvex add(float x=0; float y=0; export float out=0)
{
    out = x + y;
}

Then, in a Command Prompt window, I typed:

C:\Users\alain>cd C:\Program Files\Side Effects Software\Houdini 19.0.531\bin

C:\Program Files\Side Effects Software\Houdini 19.0.531\bin>vcc -o C:\Users\alain\Documents\houdini19.0\vex\include\code.vex C:\Users\alain\Documents\houdini19.0\vex\include\code.vfl

That creates a code.vex in the same folder, which can be referenced in Houdini.
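Since the goal is to keep everything callable from a .py library, the same compile step could probably be driven from Python with subprocess. This is a sketch only: the function names are made up, it uses just the -o flag shown in the command above, and it assumes vcc is on PATH unless a full path to the executable is passed in.

```python
import subprocess
from pathlib import Path


def build_vcc_command(vfl_path, vex_path, vcc="vcc"):
    """Return the argument list for compiling a .vfl source into a .vex file."""
    return [str(vcc), "-o", str(vex_path), str(vfl_path)]


def compile_vfl(vfl_path, vex_path=None, vcc="vcc"):
    """Compile vfl_path with vcc; the output defaults to the same name with .vex."""
    vfl_path = Path(vfl_path)
    vex_path = Path(vex_path) if vex_path else vfl_path.with_suffix(".vex")
    # check=True raises CalledProcessError if vcc reports a failure.
    subprocess.run(build_vcc_command(vfl_path, vex_path, vcc), check=True)
    return vex_path
```

Calling compile_vfl with the code.vfl path from above would then write code.vex next to it, provided vcc can be found.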

I used a simple python sop with this code

codeFile = r"C:/Users/alain/Documents/houdini19.0/vex/include/code.vex"
result = hou.runVex(codeFile, {"x":1.0, "y":0.5})

print(result)

Here's an image with all the info.
1. Path to put stuff into
2. The code.vfl file
3. Compiling code.vex
3.5 Showing code.vex's insides (not relevant)
4. Calling hou.runVex() in Houdini
[Image: screenshot showing the steps listed above]

And this successfully runs vex code. Yay.

The lack of enthusiasm is because we need much more than that, and I'm not able to get any further than that.
We need to be able to call "primpoints()" in there, but that requires having access to a geometry stream, which I don't know how to pass through.

The hou.runVex() page linked above says:

"Currently this function only works with functions in the cvex context."

I don't know the difference between cvex and sop, but that might be relevant.

I'm sorry I can't help you more than that, but that might be a dead end.

HDK sounds like the all around best way to go about this.

That said, I'd love if someone more knowledgeable than me could shed light on this.
Or come up with another solution that doesn't require VEX !

 

Edited by Alain2131