Vasu

Reputation: 1110

Speeding up selecting sets of 3 nodes that form triangles with given min and max side length

I've got a list of about 60 points, from which I'm generating all possible combinations of size 3. I'm trying to determine which triples satisfy certain distance constraints - that is, the points must be neither too close to each other nor too far apart (say, no point can be less than 10 units from the next nearest point, and no point can be more than 100 units from the furthest point).

I've got code to do so here:

import itertools
import time

import numpy as np

def setPalmCombinations(self):
    # Testing confirms that the range of distances is about 5 - 10 for
    # minimum distance between points, and anywhere from 150 - 200 for max
    # distance between points. 10 and 100 should be reasonable cutoffs.
    minDistance = 10
    maxDistance = 100
    startTime = time.time()
    combos = itertools.combinations(self.handContour, 3)
    def distanceCheck(points):
        # Each contour point comes wrapped in an extra array, hence the [0]
        A = points[0][0]
        B = points[1][0]
        C = points[2][0]
        AB = np.linalg.norm(A - B)
        if not (minDistance < AB < maxDistance): return None
        BC = np.linalg.norm(B - C)
        if not (minDistance < BC < maxDistance): return None
        CA = np.linalg.norm(C - A)
        if not (minDistance < CA < maxDistance): return None
        return np.array([A, B, C])
    a = [distanceCheck(i) for i in combos]
    print time.time() - startTime

However, just generating and filtering this list of possible combinations takes roughly .4 to .5 seconds, which is far too slow. Is there a way I can optimize this calculation (or fundamentally redo the algorithm so that it's not simply a brute-force search) so that I can get the time down to roughly a realtime calculation (that is, 30-ish times a second)?

I was thinking about possibly generating a list of the distances between every point, sorting it, and looking for the points with distances toward the center of the cutoff range, but I think that may actually be slower than what I'm doing now.

I'm also trying to figure out some more intelligent algorithm given the fact that the points are originally stored in order of how they define the contour - I should be able to eliminate some points before and after each point in the list when making the combinations ... however, when I tried to generate the combinations using this fact with Python's default for loops, it ended up being slower than simply using itertools and generating everything.
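The contour-order idea above can be expressed as an index-window filter: since consecutive contour points are close together, triples whose indices are within some window of each other along the (closed) contour can be skipped before any distance is computed. A rough sketch of that idea, with an illustrative `skip` parameter that is not from the question:

```python
import itertools

def pruned_combinations(contour, skip=5):
    """Yield 3-point combinations, skipping triples whose indices are
    within `skip` positions of each other along the closed contour."""
    n = len(contour)
    for i, j, k in itertools.combinations(range(n), 3):
        # circular index gap between two contour positions
        if min((j - i) % n, (i - j) % n) <= skip:
            continue
        if min((k - j) % n, (j - k) % n) <= skip:
            continue
        if min((k - i) % n, (i - k) % n) <= skip:
            continue
        yield contour[i], contour[j], contour[k]
```

Whether this beats plain itertools depends on how much of the work the window actually prunes; the per-triple Python-level checks carry overhead of their own, which matches the observation above that a naive loop version came out slower.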

Any advice is appreciated.

Thanks!

Upvotes: 3

Views: 424

Answers (3)

PeterE

Reputation: 5855

If the number of contour points is very large, you might want to try a heuristic approach: take three random points, check the min and max distances, calculate the radius of the corresponding circle, compare it to the previously calculated radius, and repeat. You would have to figure in some kind of stopping criterion, like a maximum number of iterations, no better radius found over the last x iterations, a radius growth rate below x percent, or some combination of those.
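A minimal sketch of this sampling heuristic, using a fixed iteration budget as the stopping criterion (the function name and parameters are illustrative, not from the question):

```python
import math
import random

def sample_triangles(points, min_dist, max_dist, max_iters=1000):
    """Randomly sample 3-point triples until one satisfies the side-length
    constraints, or the fixed iteration budget runs out."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    for _ in range(max_iters):
        a, b, c = random.sample(points, 3)
        sides = dist(a, b), dist(b, c), dist(c, a)
        if all(min_dist < s < max_dist for s in sides):
            # A fuller version would compute the circle's radius here,
            # keep the best triple seen so far, and apply one of the
            # richer stopping criteria described above.
            return a, b, c
    return None
```

This trades completeness for speed: it never enumerates all C(n, 3) triples, so the cost per frame is bounded by the iteration budget rather than by n.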

Apart from that, in regard to your clarification comment under your question: either your terminology or your approach is off. There is no guarantee that a circle going through three of the contour points lies inside the convex hull of the contour points. Imagine, for example, a spread hand: the shape will have concave segments, and if all three points are within one such segment, the circle will lie outside the hand. What I think you mean is the circle inscribed within the convex hull of each three-point set.

Upvotes: 1

Oliver W.

Reputation: 13459

I would first calculate the pairwise distance between all points, filter for the pairs that meet your specifications (the dist_min and dist_max), and then check whether a combination of three such nodes still matches the criteria (because I'm only checking pairwise, once, for speed).

import itertools
import numpy as np
from scipy.spatial.distance import pdist, squareform

points = np.random.rand(100,2)  # could be nodes in 3 dimensions too, but you specify a handpalm, so I'm guessing your contour is merely 2D

def palm_combinations(points, dist_min=.0, dist_max=.05):
    dists = pdist(points)
    # This matrix lists the distance between each pair of nodes 
    # (represented by their column and row index, so that the mat[i,j] 
    # represents the distance between points[i,:], points[j,:])
    mat = squareform(dists)  

    mask = (mat > dist_min) & (mat < dist_max)

    two_good_legs = {}  # store all combinations of point indices where "two legs" of the triangles have a distance that matches the spec.
    myinds = []
    for rowind, row in enumerate(mask):
        relevant = row[rowind+1:]  # distance to a point itself is always zero, hence exclude from list
        inds = np.where(relevant)[0] + (rowind+1)
        myinds.append(inds)
        two_good_legs[rowind] = list(itertools.combinations(inds, 2))

    triangles = []
    # The only thing left to do now is to check if the third leg's norm is also within spec.
    for k,v in two_good_legs.items():
        for tup in v:
            if tup[1] in myinds[tup[0]]:
                triangles.append((k, tup[0], tup[1]))
    return triangles

triangles will list all the indices within the points that meet your specifications. If you want to get back to actual coordinates, just get points[triangles,:]. You could visualize all this with:

import matplotlib.pyplot as plt
plt.triplot(points[:,0], points[:,1], triangles)
plt.plot(points[:,0], points[:,1], 'ko')

Timing differences between your implementation (modified to exclude the unknown [0] indexing, see my comment) show this:

In [12]: %timeit  my_implementation(points, .1, .15)
1 loops, best of 3: 1.58 ms per loop

In [13]: %timeit your_implementation(points, .1, .15)
1 loops, best of 3: 2.32 s per loop

With points being of the shape (100,2), so about a 1470x speed increase on my machine.

You could increase the speed slightly if you got rid of the call to squareform, which is the bottleneck according to the profiler. It isn't strictly necessary, but it makes the code more readable ("Readability counts.", cfr. the Zen of Python, by Tim Peters). For that you would make the following changes:

def palm_combinations(points, dist_min=.0, dist_max=.05):
    dists = pdist(points)
    mask = (dists > dist_min) & (dists < dist_max)
    two_good_legs, myinds, triangles = {}, [], []
    s = points.shape[0]
    ind_u1, ind_un = 0, s - 1
    for rowind in xrange(s-1):
        relevant = mask[ind_u1:ind_un]
        ind_u1 = ind_un
        ind_un += s - 1*(rowind+2)
        inds = np.where(relevant)[0] + (rowind+1)
        myinds.append(inds)
        two_good_legs[rowind] = list(itertools.combinations(inds, 2))
    for k,v in two_good_legs.items():
        for tup in v:
            if tup[1] in myinds[tup[0]]:
                triangles.append((k, tup[0], tup[1]))
    return triangles

However, the speed gains won't be noticeable for small datasets.
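The index arithmetic in the squareform-free version is the tricky part: it walks the condensed `pdist` vector row by row, where row i holds the distances from point i to every later point. A small illustration of that layout (same arithmetic as above, on a toy 6-point set):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
points = rng.random((6, 2))
dists = pdist(points)    # condensed form: upper-triangle rows, concatenated
mat = squareform(dists)  # full symmetric matrix, kept only for comparison

# Slice the condensed vector row by row using the same index arithmetic
# as the squareform-free palm_combinations above.
s = points.shape[0]
rows = []
ind_u1, ind_un = 0, s - 1
for rowind in range(s - 1):
    rows.append(dists[ind_u1:ind_un])  # distances from point rowind to later points
    ind_u1 = ind_un
    ind_un += s - (rowind + 2)
```

Each `rows[i]` should equal `mat[i, i+1:]`, which is exactly what the loop in the answer relies on.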

Upvotes: 2

Imanol Luengo

Reputation: 15909

You can speed this up using numpy. Assuming comb is your already calculated list of possible combinations, you can easily vectorize the distance calculations:

>>> import numpy as np
>>> comb = np.array(list(comb))
# roll within each triple (axis=1) so the columns line up as A-B, B-C, C-A
>>> dists = comb - np.roll(comb, -1, axis=1)
# squared L2 norm
>>> dists = dists**2

dists will be an Nx3 array containing the squared AB, BC and CA distances respectively (treating each point as a scalar).

Mask it with your conditions (comparing against squared cutoffs, since dists holds squared distances):

>>> mask = (dists > minDist**2) & (dists < maxDist**2)

Find the rows where all distances satisfy the condition:

>>> rows = np.all(mask, axis=1)

And return the combinations that satisfy the conditions:

>>> comb[rows]

UPDATE: Just noticed that this is still quite slow for large N. Here is a function that contains all the above steps:

>>> def palm_combinations(points, min_dist=10, max_dist=100):
        combs = np.array(list(itertools.combinations(points, 3)))
        dists = (combs - np.roll(combs, -1, axis=1))**2
        mask = (dists > min_dist**2) & (dists < max_dist**2)
        return combs[np.all(mask, axis=1)]

For a sample data of 100 points:

>>> a = np.random.rand(100) * 500
>>> %timeit palm_combinations(a)
10 loops, best of 3: 79.5 ms per loop

The bottleneck of the function is the generation of all the combinations:

>>> %timeit combs = np.array(list(itertools.combinations(a, 3)))
10 loops, best of 3: 69.2 ms per loop

UPDATE 2: It can be sped up with a slightly more complicated approach, using np.fromiter:

# Define combinations data type (3 float each combination)
>>> dt = np.dtype('f,f,f')
# Fast numpy array conversion
>>> combs = np.fromiter(itertools.combinations(points, 3), dt)

The function would be something like:

>>> def palm_combinations(points, min_dist=10, max_dist=100):
        dt = np.dtype('f,f,f')
        combs = np.fromiter(itertools.combinations(points, 3), dt)
        combs = combs.view('f').reshape(-1, 3)
        dists = (combs - np.roll(combs, -1, axis=1))**2
        mask = (dists > min_dist**2) & (dists < max_dist**2)
        return combs[np.all(mask, axis=1)]

And the speed test:

>>> %timeit palm_combinations(a)
10 loops, best of 3: 29.8 ms per loop

For N=100
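Note that the snippets above treat each point as a single scalar (the timing data is 1-D); for the 2-D contour points in the question, the squared coordinate differences have to be summed per side before comparing against the squared cutoffs. A possible extension along the same lines (palm_combinations_2d is my name, not from the answer):

```python
import itertools

import numpy as np

def palm_combinations_2d(points, min_dist=10, max_dist=100):
    """points: (N, 2) array of 2-D coordinates."""
    combs = np.array(list(itertools.combinations(points, 3)))  # (M, 3, 2)
    diffs = combs - np.roll(combs, -1, axis=1)  # A-B, B-C, C-A vectors
    side2 = (diffs ** 2).sum(axis=2)            # squared side lengths, (M, 3)
    mask = (side2 > min_dist ** 2) & (side2 < max_dist ** 2)
    return combs[np.all(mask, axis=1)]
```

The returned array has shape (K, 3, 2): K surviving triangles, each with three 2-D vertices.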

Upvotes: 2
