Reputation: 1346
I was going through a section in CLR that describes about using a divide and conquer approach to find the closest pair of points using euclidean distance between two points.
There is a problem that asks to find the closest pair of points using manhatten distance between the two using a similar approach. However, I am not able to grasp the difference between the two. Following is what I could think of:
1) X and Y be two arrays with points sorted on x- and y- coordinates respectively.
2) Find a line 'l' that divides the set of points between a subset Pl and Pr with all points less then or equal to 'l' going to Pl and otherwise to Pr. Use Xl and Xr be two arrays containing points from Pl and Pr. Same goes for Yl and Yr.
3) Recurse till our subset of points <= 3 (use brute force in this case)
4) Minimum distance could be the one returned from any of the recursive calls - call it d.
5) Find all points within 2d width around line 'l' and then for each such point find its manhatten distance with (all ??) points within that strip ?
6) return the minimum from step (5) and (4)
It is in step 5) where I am stuck. Not able to figure out the logic behind choosing the number of points to compare with.
To be more specific: It is mentioned in CLR that for euclidean distances, we need to check only 7 points, And after searching over google this reference states that for Manhattan distances, we need to consider 12 points. I am unable to understand this logic of how this choice of points is dependent on the type of distance we are seeking.
Upvotes: 4
Views: 5542
Reputation: 152
For each point in the left part of L, there are at most 6 points in the right with which the distance is less than d. These 6 points can be positioned at the six corners of the two squres . With one more, a pair with a distance less than d can be found.
Then sort the points in the right side of L by y-coordinates. Find the 6 points which are closest to P in y-axis, which may produce a closer distance.
Upvotes: 3
Reputation: 18556
During the 5) step, it is enough to compare each point only with those points that have cur_y - d <= y <= cur_y(if they were split by x coordinates, cur_y is a y coordinate of the current point), because any point located lower than cur_y - d is definitely further than d from the current one.
Upvotes: 0