quannabe
quannabe

Reputation: 384

Groups of size 2 based on distance

I have a list of several thousand points of length N with Latitude and Longitude for each.

I would like to group these points into N/2 groups with each containing 2 points (one will have 3 if N is odd).

The aim of this grouping is to minimize distance between the 2 points. We could consider error for each group as distance between points squared. And total error sum of error over all groups.

Given the constraint that the algorithm should be relatively fast (this will be deployed on API and run in response to user request), what would be best algorithm to achieve this?

The grouping doesn't necessarily need to be the 'best' possible, but preferably deterministic.

Upvotes: 1

Views: 1788

Answers (2)

Has QUIT--Anony-Mousse
Has QUIT--Anony-Mousse

Reputation: 77454

Compute the center.

Sort points by distance from the center.

In descending order, pick the next unmatched point and pair it with the nearest unmatched neighbor. You can use the triangle inequality to keep the candidates small.

With an index, this greedy approach is O(n log n) otherwise O(n^2). It's probably not the best result, but it should be fairly good for this running time. The presorting avoids the really bad cases (as long as the center isn't too unbalanced).

Upvotes: 1

Daniel
Daniel

Reputation: 7724

first, we can define a quadrant function:

int quadrant(point a){
    if(a.latitude > 0)
        if(a.longitude > 0) return 1;
        else                return 2;
    else
        if(a.longitude < 0) return 3;
        else                return 4;
}

and then we can sort the points like this:

bool comparison(point a, point b){
    if (quadrant(a) == quadrant(b)){
        if(a.longitude > b.longitude) return true;
        else return a.longitude > b.longitude;
    else 
        return quadrant(a) < quadrant(b);
}

the quadrant function works just as a cartesian plane, and will help you to put points together.

the sorting based on a comparison like that will help you to group near points together, so you can make pairs (a1,a2), (a3,a4), ..., (an-1, an) and it will take O(N lg(N)) time.

you can optimize a little if you treat better the adjacent quadrants or do some more logics, but this is a good start.

Upvotes: 0

Related Questions