Reputation: 24672

Best data structure for nearest neighbour in 1 dimension

I have a list of values (1-dimensional) and I would like to know the best data structure / algorithm for finding the nearest to a query value I have. Most of the solutions (all?) I found for questions here are for 2 or more dimensions. Can anybody suggest to me the approach for my case?

My instinct tells me to sort the data and use binary search somehow. By the way, there is no limit on the construction or insertion time for any tree needed, so probably someone can suggest a better tree than simply a sorted list.

Upvotes: 10

Answers (5)

J D

Reputation: 48707

Using OCaml's Set:

module S = Set.Make(struct type t = int let compare = compare end)

let nearest xs y =
  let a, yisin, b = S.split y xs in
  if yisin then y
  else
      let amax, bmin = S.max_elt a, S.min_elt b in
      if abs (amax - y) < abs (bmin - y) then amax else bmin

Incidentally, you may appreciate my nth-nearest neighbor sample from OCaml for Scientists and The F#.NET Journal article Traversing networks: nth-nearest neighbors.

Upvotes: 0

quantumSoup

Reputation: 28192

Sort the list and use binary search to find the element you are looking for, then compare your left and right neighbors. You can use an array which is O(1) access.

Something like:

int nearest(int[] list, int element) {

    sort(list);
    int idx = binarySearch(element, list);

    // make sure you are accessing elements that exist
    min = (element - list[idx-1] <= list[idx+1] - element) ? idx-1 : idx+1;

    return list[min];
}

This is O(n log n), which will be amortized if you are going to perform many look ups.

EDIT: For that you'd have to move the sorting out of this method

Upvotes: 1

mathmike

Reputation: 1014

If you need something faster than O(log(n)), which you can easily get with a sorted array or a binary search tree, you can use a van Emde Boas Tree. vEB trees give you O(log(log(n))) to search for the closest element on either side.

Upvotes: 9

Eyal Schneider

Reputation: 22456

If insertion time is irrelevant, then binary search on a sorted array is the simplest way to achieve O(log N) query time. Each time an item is added sort everything. For each query, perform a binary search. If a match is found, return it. Otherwise, the binary search should return the index of the item, where it should have been inserted. Use this index to check the two neighboring items and determine which of them is closer to the query point.

I suppose that there are solutions with O(1) time. I will try to think of one that doesn't involve too much memory usage...

Upvotes: 2

swegi

Reputation: 4112

As you already mentioned, the fastest and easiest way should be sorting the data and then looking for the left and right neighbour of a data point.

Upvotes: 1

Best data structure for nearest neighbour in 1 dimension

Answers (5)

Related Questions