xonegirlz
xonegirlz

Reputation: 8977

sort array of locations by nearest point

So the question is like this:

Given a location X and an array of locations, I want to get an array of locations that are closest to location X, in other words sorted by closest distance.

The way I solved this is by iterating through each location array and calculate the distance between X and that particular location, store that distance, and sort the location by distance using a Comparator. Done! Is there a better way to do this? Assuming that the sort is a merge sort, it should be O(n log n).

Upvotes: 6

Views: 2791

Answers (5)

moooeeeep
moooeeeep

Reputation: 32532

The problem you describe sounds like a m-nearest neighbor search to me.

So, if I correctly understood your question, i.e. the notion of a location being a vector in a multidimensional metric space, and the distance being a proper metric in this space, then it would be nice to put the array of locations in a k-d-Tree. You have some overhead for the tree building once, but you get the search for O(log n) then.

A benefit of this, assuming you are just interested in the m < n closest available locations, you don't need to evaluate all n distances every time you search for a new X.

Upvotes: 2

user684934
user684934

Reputation:

Proof by contradiction that you can't do better:

It is well known that comparison-based sorts are the only way to sort arbitrary numbers (which may include irrational numbers), and that they can't do better than n*log(n) time.

If you go through the list in O(n) time and select the smallest number, then use that as X, and somehow come up with a list of numbers that are sorted by distance to X, then you have sorted n numbers in less than O(n*log(n)) time.

Upvotes: 0

kyun
kyun

Reputation: 640

If I understand this right, you can do this pretty quickly for multiple queries - as in, given several values of X, you wouldn't have to re-sort your solution array every time. Here's how you do it:

  1. Sort the array initially (O(n logn) - call this pre-processing)
  2. Now, on every query X, binary search the array for X (or closest number smaller than X). Maintain, two indices, i and j, one which points to the current location, one to the next. One among these is clearly the closest number to X on the list. Pick the smaller distance one and put it in your solution array. Now, if i was picked, decrement i - if j was picked, increment j. Repeat this process till all the numbers are in the solution array.

This takes O(n + logn) for each query with O(nlogn) preprocessing. Of course, if we were talking about just one X, this is not better at all.

Upvotes: 2

Patrick87
Patrick87

Reputation: 28302

You can't do asymptotically better than O(n log n) if using a comparison-based sort. If you want to talk about micro-optimization of the code, though, some ideas include...

  1. Sort by squared distance; no reason to ever use sqrt() - sqrt() is expensive
  2. Only compute squared distance if necessary; if |dx1| <= |dx2| and |dy1| <= |dy2|, pt1 is closer than pt2 - integer multiplication is fast, but avoiding it in many cases may be somewhat faster

A thinking-outside-the-box solution might be to use e.g. Bucket sort... A linear time sorting algorithm which might be applicable here.

Upvotes: 0

Saurabh
Saurabh

Reputation: 7964

You should try using min-heap ds to implement this. Just keep on storing the locations in heap with key = diff of X and that location

Upvotes: 0

Related Questions