plasmacel
plasmacel

Reputation: 8530

Predict the required number of preallocated nodes in a kD-Tree

I'm implementing a dynamic kD-Tree in array representation (storing the nodes in std::vector) in breadth-first fashion. Each i-th non-leaf node have a left child at (i<<1)+1 and a right child at (i<<1)+2. It would support incremental insertion of points and collection of points. However I'm facing problem determining the required number of possible nodes to incrementally preallocate space.

I've found a formula on the web, which seems to be wrong:

N = min(m − 1, 2n − ½m − 1),

where m is the smallest power of 2 greater than or equal to n, the number of points.

My implementation of the formula is the following:

size_t required(size_t n)
{
    size_t m = nextPowerOf2(n);
    return min(m - 1, (n<<1) - (m>>1) - 1);
}

function nextPowerOf2 returns a power of 2 largest or equal to n

Any help would be appreciated.

Upvotes: 4

Views: 428

Answers (1)

Irvan
Irvan

Reputation: 449

Each node of a kd-tree divides the space into two spaces. Hence, the number of nodes in the kd-tree depends on how you perform this division:

1) If you divide them in the midpoint of the space (that is, if the space is from x1 to x2, you divide the space with the x3=(x1+x2)/2 line), then: i) Each point will be allocated its own node, and ii) Each intermediate node will be empty.

In this case, the number of nodes will depend on how large the coordinates of the points are. If the coordinates are bounded by |X|, then the total number of nodes in the kd-tree should be slightly less than log |X| * n (more precisely, around log |X| * n - n log n + 2n) in the worst case. To see this, consider the following way to add the points: you add multiple collections, each collection has two extremely nearby points located at random. For each pair of point, the tree will need to continuously divide the space log |X| times, and if log |X| is significantly larger than log n, creating log |X| intermediate nodes in the process.

2) If you divide them by using a point as a dividing line, then each node (including the intermediate nodes) will contain a point. Thus, the total number of nodes is simply n. However, note that using a point to divide the space may yield to a very bad performance if the points are not given in a random order (for example, if the points are given in an ascending order of X, the depth of the tree would be O(n). For comparison, the depth of the tree in (1) is at most O(log |X|) ).

Upvotes: 1

Related Questions