Finding nearest neighbor(s) in a KD Tree

Question

Warning: Fairly long question, perhaps too long. If so, I apologize.

I'm working on a program involving a nearest neighbor(s) search of a kd tree (in this example, it is an 11 dimensional tree with 3961 individual points). We've only just learned about them, and while I have a good grasp of what the tree does, I get very confused when it comes to the nearest neighbor search.

I've set up a 2D array of points, each containing a quality and a location, which looks like this.

struct point{
  double quality;
  double location;
}

// in main
point **parray; 
// later points to an array of [3961][11] points

I then translated the data so it has zero mean, and rescaled it for unit variance. I won't post the code as it's not important to my questions. Afterwards, I built the points into the tree in random order like this:

struct Node {
  point *key;
  Node *left;
  Node *right;
  Node (point *k) { key = k; left = right = NULL; }
};

Node *kd = NULL;
// Build the data into a kd-tree
random_shuffle(parray, &parray[n]);
for(int val=0; val



Pretty standard stuff. If I've used random_shuffle() incorrectly, or if anything is inherently wrong about the structure of my tree, please let me know. It should shuffle the first dimension of the parray, while leaving the 11 dimensions of each in order and untouched.

Now I'm on to the neighbor() function, and here's where I've gotten confused.

The neighbor() function (last half is pseudocode, where I frankly have no idea where to start):

Node *neighbor (Node *root, point *pn, int d, 
                Node *best, double bestdist) {
  double dist = 0;
  // Recursively move down tree, ignore the node we are comparing to
  if(!root || root->key == pn) return NULL;

  // Dist = SQRT of the SUMS of SQUARED DIFFERENCES of qualities
  for(int dim=1; dimkey->quality, 2);
  dist = sqrt(dist);

  // If T is better than current best, current best = T
  if(!best || dist


Here's the call to neighbor in main(), mostly uncompleted. I'm not sure what should be in main() and what should be in the neighbor() function:

// Nearest neighbor(s) search
double avgdist = 0.0;
// For each neighbor
for(int i=0; i


As you can see, I'm stuck on these last two sections of code. I've been wracking my brain over this for a few days now, and I'm still stuck. It's due very soon, so of course any and all help is appreciated. Thanks in advance.

Finding nearest neighbor(s) in a KD Tree

Answers (1)

Related Questions