Reputation: 3387
I am looking for some data structures for range searching. I think range trees offer a good time complexity (but with some storage requirements).
However, it seems to me that other data structures, like KD-trees, are more discussed and recommended than range trees. Is this true? If so, why?
Upvotes: 2
Views: 1147
Reputation: 1800
For your needs (particle simulation in 2D or 3D), you want a data structure capable of all nearest neighbor queries. The cover tree is a data structure that is best suited for this task. I came across it while computing the nearest neighbors for kernel density estimation. This Wikipedia page explains the basic definition of the tree, and John Langford's page has a link to a C++ implementation.
The running time of a single query is O(c^12*logn), where c is the expansion constant of the dataset. This is an upper bound - in practice, the data structure performs faster than others. This paper shows that the running time of batch processing of all nearest neighbors (for all the data points), as needed for a particle simulation, is O(c^16*n), and this theoretical linear bound is also practical for your need. Construction time is O(nlogn) and storage is O(n).
Upvotes: 0
Reputation: 1198
I would expect that it is because kd-trees can straightforwardly be extended to contains objects other than points. This gives it many applications in e.g. virtual worlds, where we want quick querying of triangles. Similar extensions of range trees are not straightforward, and in fact I've never seen any.
To give a quick recap: a kd-tree can preprocess a set of n points in d-space in O(n log n) time into a structure using O(n) space, such that any d-dimensional range query can be answered in O(n1-1/d + k) time, where k is the number of answers. A range trees takes O(n logd-1n) time to preprocess, takes O(n logd-1n) space, and can answer range queries in O(logd-1n + k) time.
The query time for a range tree is obviously a lot better than that of a kd-tree, if we're talking about 2- or 3-dimensional space. However, the kd-tree has several advantages. First of all, it always requires only linear storage. Secondly, it is always constructed in O(n log n) time. Third, if the dimensionality is very high, it will outperform a range tree unless your points sets are very large (although arguably, at this point a linear search will be almost as fast as a kd-tree).
I think another important point is that kd-trees are more well known by people than range trees. I'd never heard of a range tree before taking a course in computational geometry, but I'd heard of and worked with kd-trees before (albeit in a computer graphics setting).
EDIT: You ask what is a better data structure for 2D or 3D fixed radius search when you have millions of points. I really can't tell you! I'd be inclined to say a range tree will be faster if you perform many queries, but for 3D the construction will be slower by a factor of O(log n), and memory use may become an issue before speed does. I'd recommend integrating good implementations of both structures and simply testing what does a better job for your particular requirements.
Upvotes: 1