spurra
spurra

Reputation: 1025

Convex hull consisting of max. n points

Given a set of 2D points X, I would like to find a convex hull consisting of maximum n points. Of course, this is not always possible. Therefore, I am looking for an approximate convex hull consisting of max. n points that maximally covers the set of points X.

Stated more formally, if F(H,X) returns the amount of points of X the convex hull H covers, where |H| is the amount of points out of which the hull is constructed, then I seek the following quantity:

H_hat = argmax_H F(H,X), s.t |H|<=n

Another way to regard the problem is the task of finding the polygon consisting of max. n corners of a set X such that it maximally covers said set X.

The way I've come up with is the following:

X = getSet() \\ Get the set of 2D points
H = convexHull(X) \\ Compute the convex hull
while |H| > n do
    n_max = 0
    for h in H:
        H_ = remove(h,H) \\ Remove one point of the convex hull
        n_c = computeCoverage(X,H_) \\ Compute the coverage of the modified convex hull.
        \\ Save which point can be removed such that the coverage is reduced minimally.
        if n_c > n_max:
            n_max = n_c
            h_toremove = h
    \\ Remove the point and recompute the convex hull.
    X = remove(h,X)
    H = convexHull(X)

However, this is a very slow way of doing this. I have a large set of points and n (the constraint) is small. This means that the original convex hull contains a lot of points, hence the while-loop iterates for a very long time. Is there a more efficient way of doing this? Or are there any other suggestions for approximate solutions?

Upvotes: 2

Views: 2424

Answers (3)

spurra
spurra

Reputation: 1025

The following does not solve the question in the way I have formulated it, but it solved the problem which spawned the question. Therefore I wanted to add it in case anybody else encounters something similar.

I investigated two approaches:

1) Regards the convex hull as a polygon and apply a polygon simplification algorithm on it. The specific algorithm I investigated was the Ramer-Douglas-Peucker Algorithm.

2) Apply the algorithm described in in the question, without recomputing the convex hull.

Neither approaches will give you (as far as I can tell), the desired solution to the optimization problem stated, but for my tasks they worked sufficiently well.

Upvotes: 1

David Eisenstat
David Eisenstat

Reputation: 65506

A couple of ideas.

  1. The points that we exclude when deleting a vertex lie in the triangle formed by that vertex and the two vertices adjacent to it on the convex hull. Deleting any other vertex does not affect the set of potentially excluded points. Therefore, we only have to recompute coverage twice for each deleted vertex.

  2. Speaking of recomputing coverage, we don't necessarily have to look at every point. This idea doesn't improve the worst case, but I think it should be a big improvement in practice. Maintain an index of points as follows. Pick a random vertex to be the "root" and group the points by which triangle formed by the root and two other vertices contains them (O(m log m) time with a good algorithm). Whenever we delete a non-root vertex, we unite and filter the two point sets for the triangles involving the deleted vertex. Whenever we recompute coverage, we can scan only points in two relevant triangles. If we ever delete this root, choose a new one and redo the index. The total cost of this maintenance will be O(m log^2 m) in expectation where m is the number of points. It's harder to estimate the cost of computing coverage, though.

  3. If the points are reasonably uniformly distributed within the hull, maybe use area as a proxy for coverage. Store the vertices in a priority queue ordered by the area of their triangle formed by their neighbors (ear). Whenever we delete a point, update the ear area of its two neighbors. This is an O(m log m) algorithm.

Upvotes: 2

kafman
kafman

Reputation: 2860

May be the following approach could work for you: Initially, compute the convex hull H. Then select a subset of n points at random from H, which constructs a new polygon that might not cover all the points, so let's call it quasi-convex hull Q. Count how many points are contained in Q (inliers). Repeat this for a certain amount of times and keep the Q proposal with the most inliers.

This seems a bit related to RANSAC, but for this task, we don't really have a notion of what an outlier is, so we can't really estimate the outlier ratio. Hence, I don't know how good the approximation will be or how many iterations you need to get a reasonable result. May be you can add some heuristics instead of choosing the n points purely at random or you can have a threshold of how many points should at least be contained in Q and then stop when you reach that threshold.

Edit

Actually after thinking about it, you could use a RANSAC approach:

max_inliers = 0
best_Q = None
while(true):
    points = sample_n_points(X)
    Q = construct_convex_hull(points)
    n_inliers = count_inliers(Q, X)
    if n_inliers > max_inliers:
        max_inliers = n_inliers
        best_Q = Q
    if some_condition:
        break

The advantage of this is that the creation of the convex hull is faster than in your approach, as it only uses a maximum of n points. Also, checking the amount of inliers should be fast as it can be just a bunch of sign comparisons with each leg of the convex hull.

Upvotes: 1

Related Questions