Reputation: 384
I have a list of several thousand points of length N with Latitude and Longitude for each.
I would like to group these points into N/2 groups with each containing 2 points (one will have 3 if N is odd).
The aim of this grouping is to minimize distance between the 2 points. We could consider error for each group as distance between points squared. And total error sum of error over all groups.
Given the constraint that the algorithm should be relatively fast (this will be deployed on API and run in response to user request), what would be best algorithm to achieve this?
The grouping doesn't necessarily need to be the 'best' possible, but preferably deterministic.
Upvotes: 1
Views: 1788
Reputation: 77454
Compute the center.
Sort points by distance from the center.
In descending order, pick the next unmatched point and pair it with the nearest unmatched neighbor. You can use the triangle inequality to keep the candidates small.
With an index, this greedy approach is O(n log n) otherwise O(n^2). It's probably not the best result, but it should be fairly good for this running time. The presorting avoids the really bad cases (as long as the center isn't too unbalanced).
Upvotes: 1
Reputation: 7724
first, we can define a quadrant function:
int quadrant(point a){
if(a.latitude > 0)
if(a.longitude > 0) return 1;
else return 2;
else
if(a.longitude < 0) return 3;
else return 4;
}
and then we can sort the points like this:
bool comparison(point a, point b){
if (quadrant(a) == quadrant(b)){
if(a.longitude > b.longitude) return true;
else return a.longitude > b.longitude;
else
return quadrant(a) < quadrant(b);
}
the quadrant function works just as a cartesian plane, and will help you to put points together.
the sorting based on a comparison like that will help you to group near points together, so you can make pairs (a1,a2), (a3,a4), ..., (an-1, an) and it will take O(N lg(N)) time.
you can optimize a little if you treat better the adjacent quadrants or do some more logics, but this is a good start.
Upvotes: 0