Jump to content
Julien BOLBACH

Delete scatter intersections

Recommended Posts

Hi guys,

I'm trying to find a way to delete points of a scatter that are too close to each other.
My idea is :
- I calculate the nearpoints for each point of the scatter, within the pscale distance of each point
- For all those nearpoints, I want to delete them, and keep the original point, but more importantly, I want to make sure that I won't run the algorithm on those points again, otherwise I will delete too many points.
- I've made a working prototype, but it's very slow. I'm doing it through a detail wrangle, storing the points I want to keep, to make sure they wouldn't be deleted later. This is obviously not optimised and fairly slow for high numbers of points, but I haven't found another way to make it work. 

I attached a sample file of what I'm doing, any suggestion to make this faster would be really appreciated. There might be other ways of achieving what I am after, so please do let me know if you have ideas.

Thanks a lot,

Julien

deleteIntersections_v002.hipnc

Share this post


Link to post
Share on other sites

Hey Julien,

I'm not sure of a good way to keep one of the two points from being deleted, but maybe this can give you a starting point for a faster method using PC open

@pscale = .7;
int maxNeighbours = 1;
int handle = pcopen ( 0, "P", @P, @pscale, 1+maxNeighbours );
int neighbourCount = pcnumfound ( handle ) - 1;
pcclose ( handle );
v@Cd.r = fit(neighbourCount,0,maxNeighbours,0,1);
v@Cd.g = v@Cd.r;
v@Cd.b = v@Cd.r;

if (@Cd.r == 1){
removepoint(geoself(),@ptnum);
}

This way seems significantly faster, but still needs something to keep it from over deleting points in the heavily clustered sections.

Share this post


Link to post
Share on other sites

Hi guys,

I had the same idea and made a Python version with the same result as the vex version (same points), and got significantly faster as well (x10), but it is still relatively slow for the number of points (12 seconds for 40k points, 2mins in Vex).

deleteIntersections_v003.hipnc

I would still love to find a faster way if possible, specially if I want to be able to get this working for 100+Ks of points ...

Cheers,

Julien

Edited by Julien BOLBACH

Share this post


Link to post
Share on other sites

So the big issue here, as you are probably aware, is that you run your wrangle in detail mode which is not multi threaded and therefore the hit in speed.

I suggest you to do something like this which is super fast (although it might not be what you are after).

run over points:

int pts[] = nearpoints(0,@P,@pscale*chf("distance"));

foreach(int pt; pts){
    if(pt <= @ptnum)
        continue;

    vector pt_pos = point(0, 'P', pt);
    float pt_pscale = point(0, 'pscale', pt);
    if(distance(@P, pt_pos) < @pscale + pt_pscale)
        {
        removepoint(geoself(),@ptnum);
        }
}

 

if you set the distance to 2, you essentially remove every point around it, thus in theory your points should not be flagged again by another point.  However as the calculations occur simultaneously this is not bullet proof. I suggest you to use a point relax to space out the few that it might have missed.  

I apologize if I totally missed the point (HAH!) of your post, I am but a noob trying to help as much as I can :P

 

sorry for not being able to upload a file atm..

Share this post


Link to post
Share on other sites

Hi Jesper,

Your solution seems to work really well !

As you mentioned, I used the wrangle in detail in the first place the calculations run in parallel, and I was unable to store and query my "keep" attribute.

That being said, the removepoint seems to cover for that, as the point doesn't exist anymore, it does the trick. I'm still not 100% sure what you can and cannot do when running parallel calculations, but this seems to be working just fine (and it's lighting fast !).

I just did this adjustment : promote the pscale into detail, as max, and use this pscale*2 to unsure I would sample all the points around the max size. 

One last thing, what does the if(pt<=@ptnum) continue; does ? It seems a bit arbitrary to me, is it just to make sure to remove the points in order and not mess up the point numbers ?

Thanks a lot !

Julien

Share this post


Link to post
Share on other sites
2 minutes ago, Julien BOLBACH said:

One last thing, what does the if(pt<=@ptnum) continue; does ?

my guess is this, the candidates for nearpoints are given in order with the closest one first....which is itself...so i would have just skipped the 1st one instead...is that right ? or am i talking out of my rear end ?

  • Like 1

Share this post


Link to post
Share on other sites
10 hours ago, Noobini said:

my guess is this, the candidates for nearpoints are given in order with the closest one first....which is itself...so i would have just skipped the 1st one instead...is that right ? or am i talking out of my rear end ?

That is correct.

consider p0 and  p1. The distance between them is Pscale*1.5 thus the points gets flagged. so the iteration over p0 we remove p1 and for iteration over p1 remove p0. This leaves us with no points at all.. so the if(pt<=@ptnum) contrinue; is the condition for not removing all the points. 

  • Like 1

Share this post


Link to post
Share on other sites

somehow, I think I'm still talking crap tho....ie. i was wrong in saying skipping the 1st item in the array...more likely it should be doing a self check.

Using above logic, you'll get down to the last 2 points and that's it...you cannot delete the last 2 even if you increase Distance to a bazillion...so I thought there must be a way...

So I tweaked the line if(pt <= @ptnum) to

if(pt == @ptnum) ie. here are the list of points closest to @ptnum....check distance and delete if appropriate...but of course, exclude checking against YOURSELF.

And so, with this new logic, you can delete by Distance right down to the last point...all gone..but then you could argue...well, there should be ONE last point left.....oh I don't know.....head hurts enough already...

Share this post


Link to post
Share on other sites
13 minutes ago, Noobini said:

somehow, I think I'm still talking crap tho....ie. i was wrong in saying skipping the 1st item in the array...more likely it should be doing a self check.

Using above logic, you'll get down to the last 2 points and that's it...you cannot delete the last 2 even if you increase Distance to a bazillion...so I thought there must be a way...

So I tweaked the line if(pt <= @ptnum) to

if(pt == @ptnum) ie. here are the list of points closest to @ptnum....check distance and delete if appropriate...but of course, exclude checking against YOURSELF.

And so, with this new logic, you can delete by Distance right down to the last point...all gone..but then you could argue...well, there should be ONE last point left.....oh I don't know.....head hurts enough already...

sure you can get down to 1 point  if wannt

its not super clear in the code (because I was  lazy and didnt comment it) but the pscale * distance is not determining if the point gets deleted or not, its deciding the search area. the pscale on the points incoming is your threshold for deletion.

 

On 04/01/2018 at 11:54 AM, Jesper Rahlff said:

float pt_pscale = point(0, 'pscale', pt);
    if(distance(@P, pt_pos) < @pscale + pt_pscale)

I am importing the pscale on the current iterating point, and then I am adding pscale value from the wrangle node (in this case same value but it does not have to be). so if you on your incoming points increase pscale you could delete more points.

Share this post


Link to post
Share on other sites

thank YOU...(I rarely ask question coz I'm so friggin' stubborn....MUST work out this problem if it kills me !!! and so each day...I'm closer to my grave...)

Share this post


Link to post
Share on other sites
12 minutes ago, Noobini said:

thank YOU...(I rarely ask question coz I'm so friggin' stubborn....MUST work out this problem if it kills me !!! and so each day...I'm closer to my grave...)

I am the same way.

remember this though, "Only those who dare to fail greatly, can ever achieve greatly" - John F. Kennedy

Share this post


Link to post
Share on other sites

Hey guys,

Thanks for this.

I was still bothered by this if (npt <= @ptnum) line and your explanation on it, but it does all the trick. Here is why :

- Consider this simple example :

deleteScatter2.jpg.85a4d29deb06e695b4f684d0c32eb391.jpg

- As you mentioned, result from the nearpoints are in order of their distance. But the point numbers can be completely random :

   - nearpoints for @ptnum = 1 -> [6,2,3,5,4]

   - nearpoints for @ptnum = 2 -> [6,1,4,5,3]

   - etc.

- So the statement of skipping points by comparing the ptnum to pt is not accurate. For sure, you want to skip itself so it should at least be if (pt == @ptnum ) continue; But this removes too many points still.

- BUT, by skipping the points before ptnum, you assume that those points have already been "taken care of". So skipping them makes all the trick.

- Consider this : 

   - @ptnum = 1 : let's say nearpoints give you back [6,2] and pscale is big enough to get the point deleted.

   - @ptnum = 2 : nearpoints would give you back [6,1] and pscale is big enough to get the point deleted.

   - @ptnum = 6 : nearpoints would give you back [1,2]. if you keep the if (pt == @ptnum), the point would get deleted. But with (pt <= @ptnum), you skip 1 and 2 and end up not deleting the point. 

 I don't know if this was made on purpose Jesper, but this is definitely working :)

Thanks again,

Julien

Share this post


Link to post
Share on other sites
11 hours ago, Julien BOLBACH said:

Hey guys,

Thanks for this.

I was still bothered by this if (npt <= @ptnum) line and your explanation on it, but it does all the trick. Here is why :

- Consider this simple example :

deleteScatter2.jpg.85a4d29deb06e695b4f684d0c32eb391.jpg

- As you mentioned, result from the nearpoints are in order of their distance. But the point numbers can be completely random :

   - nearpoints for @ptnum = 1 -> [6,2,3,5,4]

   - nearpoints for @ptnum = 2 -> [6,1,4,5,3]

 

hang on....shouldn't that be nearpoints for @ptnum = 1 -> [1,6,2,3,5,4]

and nearpoints for @ptnum = 2 -> [2,6,1,4,5,3]

ie...the 1st closest point is always (well assume non-IDENTICALS for simplicity sake) itself...?

Share this post


Link to post
Share on other sites

Yes sorry, you are completely right, I was already assuming we all knew this already and forgot to add it to the example. 

But the rest of the explanation stands. 

Cheers

Julien

Share this post


Link to post
Share on other sites
On 07/01/2018 at 2:27 AM, Julien BOLBACH said:

Hey guys,

Thanks for this.

I was still bothered by this if (npt <= @ptnum) line and your explanation on it, but it does all the trick. Here is why :

- Consider this simple example :

deleteScatter2.jpg.85a4d29deb06e695b4f684d0c32eb391.jpg

- As you mentioned, result from the nearpoints are in order of their distance. But the point numbers can be completely random :

   - nearpoints for @ptnum = 1 -> [6,2,3,5,4]

   - nearpoints for @ptnum = 2 -> [6,1,4,5,3]

   - etc.

- So the statement of skipping points by comparing the ptnum to pt is not accurate. For sure, you want to skip itself so it should at least be if (pt == @ptnum ) continue; But this removes too many points still.

- BUT, by skipping the points before ptnum, you assume that those points have already been "taken care of". So skipping them makes all the trick.

- Consider this : 

   - @ptnum = 1 : let's say nearpoints give you back [6,2] and pscale is big enough to get the point deleted.

   - @ptnum = 2 : nearpoints would give you back [6,1] and pscale is big enough to get the point deleted.

   - @ptnum = 6 : nearpoints would give you back [1,2]. if you keep the if (pt == @ptnum), the point would get deleted. But with (pt <= @ptnum), you skip 1 and 2 and end up not deleting the point. 

 I don't know if this was made on purpose Jesper, but this is definitely working :)

Thanks again,

Julien

nice explanation Julien.

 

On 05/01/2018 at 11:09 AM, Jesper Rahlff said:

consider p0 and  p1. The distance between them is Pscale*1.5 thus the points gets flagged. so the iteration over p0 we remove p1 and for iteration over p1 remove p0. This leaves us with no points at all.. so the if(pt<=@ptnum) contrinue; is the condition for not removing all the points. 

This was my lazy way of trying to explain the same thing :P

Share this post


Link to post
Share on other sites

I'm not sure that you have specified exactly what you want to do; for example, do you only want to delete points, or would merging close neighbors be acceptable ? Because in that case, you already have Fuse that does that (not sure if Fuse deletes or merges points.. see the doc)  ; if not you can still use the output of the Fuse node to detect which points were changed, and you will have a smaller search area for what you really want to do; maybe spatially sort each set of points (before / after) with a Sort, then you can easily find which points are problematic with a for loop; maybe save the original @ptnum in an attrib so that you can tell which points are missing ans which survived the fuse.

Finding the closest pairs of points is done in n log(n) (See Computational Geometry chapter in Intro to algorithms, Cormen Rivest et al), and I assume its implementation in Houdini runs in n log(n)

Edited by AntoineSfx

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×