kaymes
kaymes

Reputation: 185

Spatial index with sum and count for C++

I'm looking for an implementation of a spatial index that allows me to quickly count and sum up the values that are contained in a specified region.

The longer version: I have a lot of objects that I want to store in a spatial index. They each have their coordinates in the n-dimensional space as well as one extra value. Given a range I need a quick answer to the questions of (1) how many objects are within the range and (2) what is the sum of all their values.

I know that a spatial index is usually implemented using R-trees. Of course I could simply retrieve all the objects within a range and sum them up each time.

However, it seems like there is a significant speed up opportunity by storing the sum and count of all elements contained under a node within that very node. Thus, once the node in question is entirely within the queried range, it is not necessary to descend the tree any further.

Does anyone know a C++ implementation that supports these kind of "cached" operations?

Upvotes: 3

Views: 626

Answers (2)

Adam Wulkiewicz
Adam Wulkiewicz

Reputation: 2098

So what you're thinking about is an augumented R-tree. It's interesting though I'm guessing to benefit from this augumentation the query regions would have to be quite big WRT the bounding boxes of nodes and values stored in the R-tree. Otherwise the query would be forced to still always check leaf nodes (but there'd be an overhead - counters, additional checks).

Indeed as Justin R. said Boost.Geometry R-tree implementation doesn't store any counters in nodes, allow to define additional data stored in nodes or user-defined queries, at least for now (Boost 1.57).

However it'd be possible to optimize this counting query. It's not required to return any Values, create and fill a temporary container, etc. Instead values could be counted on-the-fly during the query, e.g. like this in C++11:

size_t counter = 0;
rtree.query(bgi::intersects(box),
            boost::make_function_output_iterator(
                [&](Value const&) {
                    counter++;
                }));

Upvotes: 1

Justin R.
Justin R.

Reputation: 24061

Boost has a good R-tree implmentation, though I don't think the functionality that you're looking for is built-in.

One approach would be to modify your node's data type to include an additional field to represent the subtree metadata (child count and subtree sum), or make the node a tuple of your current type and the metadata. Whenever you add, edit or delete a node, those functions would call an update function that would walk up chain of parent nodes, incrementing or decrementing the metadata.

I suspect that if you are going to bullk-loading the data, this is even easier, as you can do it in just two passes, one to go through and calculate the metadata for each node, and then do a series of inserts that don't perform the update function.

If you're not going to bulk-load, another common spatial index is the quadtree. This data structure is often better suited to spatial data that is frequently updated, as it doesn't need to re-balance all the time. I use quadtrees more than R-trees, and find them super flexible.

Upvotes: 3

Related Questions