Ouroboros
Ouroboros

Reputation: 1524

Performance of RTree vs kd-trees

I have around 10 K points in 5 dimensional space. We can assume that the points are randomly distributed in space (0,0,0,0,0) and (100,100,100,100,100). Clearly, the whole data set can easily reside in memory.

I would like to know which algorithm for k nearest neighbour would run faster, kd-tree or RTree.

Although I have some very high level idea of these two algorithms, I am not sure which will run faster, and why. I am open to exploring other algorithms if any, which could run fast. Please, if possible, specify why an algorithm may run faster.

Upvotes: 3

Views: 4835

Answers (1)

Has QUIT--Anony-Mousse
Has QUIT--Anony-Mousse

Reputation: 77454

This depends on various parameters. Most importantly on your capability to implement these algorithms.

I've personally found bulk-loaded R*-trees to be faster for large data, probably because they have a better fan-out. Bulk-loaded R-trees is a more fair comparison, as kd-trees are commonly bulk-loaded (in fact, they don't support incremental operation very well at all).

For tiny data, kd-trees will likely be faster, plus they are much simpler to implement.

For other things, please refer to this earlier question / answer:

https://stackoverflow.com/a/11109467/1060350

Upvotes: 5

Related Questions