Reputation: 1337
I tried to implement a balanced K-D tree with O(kn log n), I used presorted K arrays (sorted arrays for each index) to get O(kn log n), and median to get balanced tree.
The problem I faced was that mostly the median value at some level ,for example the median for x axis, maybe chosen again at another subsequent level, for example for y axis.
I tried to solve this by dividing y sorted array to two arrays by using chosen x value as a pivot, but this way wouldn't yield a balanced tree.
Any idea how to get K-D balanced tree with O(kn log n)?
EDIT
Quoted from wiki https://en.wikipedia.org/wiki/K-d_tree
Alternative algorithms for building a balanced k-d tree presort the data prior to building the tree. They then maintain the order of the presort during tree construction and hence eliminate the costly step of finding the median at each level of subdivision. Two such algorithms build a balanced k-d tree to sort triangles in order to improve the execution time of ray tracing for three-dimensional computer graphics. These algorithms presort n triangles prior to building the k-d tree, then build the tree in O(n log n) time in the best case.[5][6] An algorithm that builds a balanced k-d tree to sort points has a worst-case complexity of O(kn log n).[7] This algorithm presorts n points in each of k dimensions using an O(n log n) sort such as Heapsort or Mergesort prior to building the tree. It then maintains the order of these k presorts during tree construction and thereby avoids finding the median at each level of subdivision.
Anyone could provide such algorithm provided above?
EDIT
The came up with a way but it doesn't work if there is any duplicate value of specific axis for the median.
For example
x1 = [ (0, 7), (1, 3), (3, 0), (3, 1), (6, 2) ] y1 = [ (3, 0), (3, 1), (6, 2), (1, 3), (0, 7) ]
The median of x-axis is 3. So when we want to split the array y11 and y12 we have to use > and < to distribute y array left and right considering pivot as delimiter.
there is no guarantee one of them is correct if the median a on specific axis is duplicated
Consider the partition on x axis, and there is no problem on x1 array following completion of above example of first step partition:
median=(3,0)
The pivot = 3 // is it's the median of x axis
y11[],y12[]
for(i = 0 ; i < x1.size;i++)
if(y1[i].getX()<pivot)
y11.add(y1[i])
else
if(y1[i].getX()>pivot)
y12.add(y1[i])
This will result y11 = [(2 ,1) , (1, 3), (0, 7) ] y12 = [ (6,2) ]
Any idea how to handle such case? Or is there any another presorting kd-tree presorting algorithm O(kn log n) ?
Upvotes: 2
Views: 2851
Reputation: 77454
When splitting the data, you need to retain the sort order.
E.g. using data (x,y)
we build
x1 = [ (0, 7), (1, 3), (3, 0), (4, 2), (6, 1) ]
y1 = [ (3, 0), (6, 1), (3, 2), (1, 3), (0, 7) ]
If we split at x now, we need to filter both sets by the record at x=3,y=0
.
I.e. split both lists, removing (3,0)
, all items with x<3
go to the first list each, all with x>3
go to the second (order unchanged):
x1 -> filter to x11 = [ (0, 7), (1, 3) ] x12 = [ (4, 2), (6, 1) ]
y1 -> filter to y11 = [ (1, 3), (0, 7) ] y12 = [ (6, 1), (4, 2) ]
The point is to filter each sorted list by the x values, while keeping the sorting order (so this is in O(n*k) in each of O(log n) levels). If you use only x1, and reconstruct y11 and y12 from x1, then you would need to sort again. By necessity, it is the same as if you sort once by x, once by y. Except that we did not sort again, only select.
I do not think this is much better in practise. Sorting is cheaper than the extra memory.
Upvotes: 1
Reputation: 2467
To elaborate on my comment (and Anony-Mousse's answer, probably):
The key idea with pre-sorting in constructing KD-trees is to keep the order during split. The overhead looks quite high, a comparative benchmark with re-sorting (and k-select) seems in order.
Some proof-of principle Java source code:
package net.*.coder.greybeard.sandbox;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedList;
/** finger exercise pre-sorting & split for KD-tree construction
* (re. https://stackoverflow.com/q/35225509/3789665) */
public class KDPreSort {
/** K-dimensional key, dimensions fixed
* by number of coordinates in construction */
static class KKey {
public static KKey[] NONE = {};
final Comparable[]coordinates;
public KKey(Comparable ...coordinates) {
this.coordinates = coordinates;
}
/** @return {@code Comparator<KKey>} for coordinate {@code n}*/
static Comparator<KKey> comparator(int n) { // could be cached
return new Comparator<KDPreSort.KKey>() { @Override
public int compare(KKey l, KKey r) {
return l.coordinates[n]
.compareTo(r.coordinates[n]);
}
};
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder(
Arrays.deepToString(coordinates));
sb.setCharAt(0, '(');
sb.setCharAt(sb.length()-1, ')');
return sb.toString();
}
}
// static boolean trimLists = true; // introduced when ArrayList was used in interface
/** @return two arrays of {@code KKey}s: comparing smaller than
* or equal to {@code pivot} (according to {@code comp)},
* and greater than pivot -
* in the same order as in {@code keys}. */
static KKey[][] split(KKey[] keys, KKey pivot, Comparator<KKey> comp) {
int length = keys.length;
ArrayList<KKey>
se = new ArrayList<>(length),
g = new ArrayList<>(length);
for (KKey k: keys) {
// pick List to add to
List<KKey>d = comp.compare(k, pivot) <= 0 ? se : g;
d.add(k);
}
// if (trimLists) { se.trimToSize(); g.trimToSize(); }
return new KKey[][] { se.toArray(KKey.NONE), g.toArray(KKey.NONE) };
}
/** @return two arrays of <em>k</em> arrays of {@code KKey}s:
* comparing smaller than or equal to {@code pivot}
* (according to {@code comp)}, and greater than pivot,
* in the same order as in {@code keysByCoordinate}. */
static KKey[][][]
splits(KKey[][] keysByCoordinate, KKey pivot, Comparator<KKey> comp) {
final int length = keysByCoordinate.length;
KKey[][]
se = new KKey[length][],
g = new KKey[length][],
splits;
for (int i = 0 ; i < length ; i++) {
splits = split(keysByCoordinate[i], pivot, comp);
se[i] = splits[0];
g[i] = splits[1];
}
return new KKey[][][] { se, g };
}
// demo
public static void main(String[] args) {
// from https://stackoverflow.com/q/17021379/3789665
Integer [][]coPairs = {// {0, 7}, {1, 3}, {3, 0}, {3, 1}, {6, 2},
{12, 21}, {13, 27}, {19, 5}, {39, 5}, {49, 63}, {43, 45}, {41, 22}, {27, 7}, {20, 12}, {32, 11}, {24, 56},
};
KKey[] someKeys = new KKey[coPairs.length];
for (int i = 0; i < coPairs.length; i++) {
someKeys[i] = new KKey(coPairs[i]);
}
//presort
Arrays.sort(someKeys, KKey.comparator(0));
List<KKey> x = new ArrayList<>(Arrays.asList(someKeys));
System.out.println("by x: " + x);
KKey pivot = someKeys[someKeys.length/2];
Arrays.sort(someKeys, KKey.comparator(1));
System.out.println("by y: " + Arrays.deepToString(someKeys));
// split by x
KKey[][] allOrdered = new KKey[][] { x.toArray(KKey.NONE), someKeys },
xSplits[] = splits(allOrdered, pivot, KKey.comparator(0));
for (KKey[][] c: xSplits)
System.out.println("split by x of " + pivot + ": "
+ Arrays.deepToString(c));
// split "higher x" by y
pivot = xSplits[1][1][xSplits[1][1].length/2];
KKey[][] ySplits[] = splits(xSplits[1], pivot, KKey.comparator(1));
for (KKey[][] c: ySplits)
System.out.println("split by y of " + pivot + ": "
+ Arrays.deepToString(c));
}
}
(Didn't find a suitable answer/implementation on SE while not investing too much energy. The output was non-convincing with your example, with the longer one, I had to re-format it to believe it.
The code looks ugly, in all likelihood because it is: if so inclined re-read about the licence of code posted on SE, an visit Code Review.)
(Consider that there's voting, accepting, and awarding bounties, and re-visit Anony-Mousse's answer.)
Upvotes: 1