Reputation: 1502

Python prune/filter out points

I have a list of points whose coordinates steadily increase, such as the following:

[[0, 0], [9, 4], [18, 19], [25, 34], [48, 48], [54, 53], [61, 65], [69, 82], [73, 86], [87, 99]]

However, some points violate this pattern, as you can see in the list below:

[526, 590], [532, 599], [605, 539], [740, 519], [753, 539], [858, 700], [981, 615], [985, 539], [1105, 700], [1222, 590], [1359, 343], [1456, 86], [1617, 4], [1790, 44], [1885, 1927], [2016, 2008]

So I need a way to prune/filter out the points that violate the pattern; in this case, the points from [605, 539] to [1790, 44].

I tried the following, which accepts only the points whose y-coordinate lies between the y-coordinate of the previous point and the y-coordinate of the next point:

for i, pt in enumerate(points[1:-1]):
    x,y=cur_pt=points[i]
    x0,y0=prev_pt=points[i-1]
    x1,y1=next_pt=points[i+1]
    if y<y1 and y>y0:
        print 'acceptable point'
    else:
        print 'pruned'

The problem is that it prunes many valid points and keeps some points that should be pruned.

The list of points is here:

points=[[0, 0], [9, 4], [18, 19], [25, 34], [48, 48], [54, 53], [61, 65], [69, 82], [73, 86], [87, 99], [93, 105], [96, 108], [98, 110], [99, 111], [100, 112], [106, 118], [119, 131], [128, 140], [134, 146], [137, 149], [139, 151], [140, 152], [141, 153], [147, 159], [160, 172], [185, 153], [213, 219], [215, 241], [219, 245], [223, 249], [236, 262], [247, 276], [249, 278], [274, 302], [282, 294], [288, 318], [313, 352], [365, 419], [377, 423], [416, 458], [435, 468], [468, 519], [481, 539], [508, 559], [526, 590], [532, 599], [605, 539], [740, 519], [753, 539], [858, 700], [981, 615], [985, 539], [1105, 700], [1222, 590], [1359, 343], [1456, 86], [1617, 4], [1790, 44], [1885, 1927], [2016, 2008], [2072, 2137], [2186, 2212], [2219, 2477], [2260, 2482], [2425, 2477], [2568, 2460], [2646, 2609], [2816, 2792], [2913, 2686], [2960, 2853], [3072, 2959], [3210, 2925], [3249, 2809], [3359, 2959], [3446, 3057], [3517, 2809], [3809, 2959], [4033, 3190], [4232, 3057], [4439, 2809], [4632, 3057], [4706, 2809], [4715, 1922], [4725, 1596], [5058, 1560], [5066, 1596], [5107, 1560], [5362, 1333], [5432, 1471], [5519, 1560], [5610, 1471], [5693, 249], [5782, 1471], [5968, 1560], [6068, 1471], [6105, 1560], [6298, 1700], [6390, 3765], [6416, 5926], [6446, 6440], [6503, 7300], [6511, 7332], [6522, 7342]]

Upvotes: 1

Answers (4)

hmghaly

Reputation: 1502

Thanks for your help, everyone. I just figured out that I can take an approximate overall y/x ratio, (maximum y-coordinate)/(maximum x-coordinate), which is about 1.1 in this case, and filter out the points whose y/x ratio is far from this value:

valid=[v for v in points if (v[0]>0.9*v[1] and v[0]<1.5*v[1]) or v[0]==v[1]]

Upvotes: 0

Bas Jansen

Reputation: 3343

You could run a second loop inside the first one, starting at the current index of the first loop. Combine this with a boolean flag that records whether any value examined in the second loop is smaller than the current value of the first loop, like so:

#! /usr/bin/env python
points=[[0, 0], [9, 4], [18, 19], [25, 34], [48, 48], [54, 53], [61, 65], [69, 82], [73, 86], [87, 99], [93, 105], [96, 108], [98, 110], [99, 111], [100, 112], [106, 118], [119, 131], [128, 140], [134, 146], [137, 149], [139, 151], [140, 152], [141, 153], [147, 159], [160, 172], [185, 153], [213, 219], [215, 241], [219, 245], [223, 249], [236, 262], [247, 276], [249, 278], [274, 302], [282, 294], [288, 318], [313, 352], [365, 419], [377, 423], [416, 458], [435, 468], [468, 519], [481, 539], [508, 559], [526, 590], [532, 599], [605, 539], [740, 519], [753, 539], [858, 700], [981, 615], [985, 539], [1105, 700], [1222, 590], [1359, 343], [1456, 86], [1617, 4], [1790, 44], [1885, 1927], [2016, 2008], [2072, 2137], [2186, 2212], [2219, 2477], [2260, 2482], [2425, 2477], [2568, 2460], [2646, 2609], [2816, 2792], [2913, 2686], [2960, 2853], [3072, 2959], [3210, 2925], [3249, 2809], [3359, 2959], [3446, 3057], [3517, 2809], [3809, 2959], [4033, 3190], [4232, 3057], [4439, 2809], [4632, 3057], [4706, 2809], [4715, 1922], [4725, 1596], [5058, 1560], [5066, 1596], [5107, 1560], [5362, 1333], [5432, 1471], [5519, 1560], [5610, 1471], [5693, 249], [5782, 1471], [5968, 1560], [6068, 1471], [6105, 1560], [6298, 1700], [6390, 3765], [6416, 5926], [6446, 6440], [6503, 7300], [6511, 7332], [6522, 7342]]
accept=[]
for counter, i in enumerate(points):  # Iteration 1
    y = i[1]
    flag = False  # False indicates this value is still valid
    for j in points[counter:]:  # Iteration 2
        if int(j[1]) < y:
            flag = True  # A later y is smaller, so this value must be omitted
            break  # No need to continue with iteration 2
    if not flag:
        accept.append(i)
print(accept)

This gives the following output:

[[0, 0], [9, 4], [1617, 4], [1790, 44], [5693, 249], [5782, 1471], [6068, 1471], [6105, 1560], [6298, 1700], [6390, 3765], [6416, 5926], [6446, 6440], [6503, 7300], [6511, 7332], [6522, 7342]]

I hope that is what you wanted?
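For longer lists, the same selection can be made in a single pass by walking the list backwards and tracking the smallest y seen so far. This is a sketch, not part of the answer above; the function name `keep_suffix_minima` is hypothetical.

```python
def keep_suffix_minima(points):
    """Keep each point whose y is not greater than the y of any later point."""
    kept = []
    lowest_later_y = float("inf")
    # Walk from the end so the smallest later y is always known.
    for point in reversed(points):
        if point[1] <= lowest_later_y:
            kept.append(point)
            lowest_later_y = point[1]
    kept.reverse()  # Restore the original left-to-right order
    return kept


sample = [[0, 0], [9, 4], [18, 19], [1617, 4], [1790, 44],
          [1885, 1927], [5693, 249], [6522, 7342]]
print(keep_suffix_minima(sample))
# [[0, 0], [9, 4], [1617, 4], [1790, 44], [5693, 249], [6522, 7342]]
```

As the sample shows, this rule keeps a low outlier such as [1617, 4] and drops the earlier points with larger y values, so it selects a non-decreasing sequence but not necessarily the one the question describes.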

Upvotes: 0

jamylak

Reputation: 133554

>>> points = [526, 590], [532, 599], [605, 539], [740, 519], [753, 539], [858, 700], [981, 615], [985, 539], [1105, 700], [1222, 590], [1359, 343], [1456, 86], [1617, 4], [1790, 44], [1885, 1927], [2016, 2008]
>>> def prune(points):
        yield points[0]
        last = points[0]
        for point in points[1:]:
            if point[1] > last[1]:
                last = point
                yield point


>>> list(prune(points))
[[526, 590], [532, 599], [858, 700], [1885, 1927], [2016, 2008]]

Upvotes: 2

Aya

Reputation: 41950

I try to use the following, to accept only the points whose y-coordinate lie between the y-coordinate of the previous point and the y-coordinate of the next point...

Would it not make more sense to also check the x-coordinate with something like...

for i, pt in enumerate(points[1:-1], 1):  # start at 1 so that points[i] is pt
    x,y=cur_pt=points[i]
    x0,y0=prev_pt=points[i-1]
    x1,y1=next_pt=points[i+1]
    if y<y1 and y>y0 and x<x1 and x>x0:
        print('acceptable point')
    else:
        print('pruned')

Upvotes: 0
