Reputation: 5002
I have an application which is used for displaying and modifying huge volumes of point cloud data from lidar files (up to few gigabytes each, sometimes loaded in simultaneously). In the app the user is able to view a 2D image of loaded points (from the top) and select a profile to view in another window (from the side). Again this involves millions of points and they are displayed using OpenGL.
To handle the data there is also a quadtree library, which works, but is extremely slow. It has been used for some time, but recently the lidar point format changed and the LidarPoint object needed a number of attributes (class members) added, which cause it to grow in size in turn affecting the performance to almost unusable level (think 5 minutes to load a single 2GB file).
The quadtree currently consist of pointers to PointBucket objects which are simply arrays of LidarPoint objects with specified capacity and defined boundaries (for spatial queries). If the bucket capacity is exceeded it splits into four buckets. There is also kind of a caching system in place which causes point buckets to get dumped to disk when the point data is taking too much memory. These are then loaded back into memory if needed. Finally every PointBucket contains subbuckets/resolution levels which hold every n-th point of the original data and are used when displaying the data depending on the zoom level. That is because displaying few million points at once, while that level of detail is not necessary, is just extremely slow.
I hope you can get a picture from this. If not please ask and I can provide some more details or upload more code. For example here is the current (and slow) insert method:
// Insert in QuadTree
bool QuadtreeNode::insert(LidarPoint newPoint)
{
// if the point dosen't belong in this subset of the tree return false
if (newPoint.getX() < minX_ || newPoint.getX() > maxX_ ||
newPoint.getY() < minY_ || newPoint.getY() > maxY_)
{
return false;
}
else
{
// if the node has overflowed and is a leaf
if ((numberOfPoints_ + 1) > capacity_ && leaf_ == true)
{
splitNode();
// insert the new point that caused the overflow
if (a_->insert(newPoint))
{
return true;
}
if (b_->insert(newPoint))
{
return true;
}
if (c_->insert(newPoint))
{
return true;
}
if (d_->insert(newPoint))
{
return true;
}
throw OutOfBoundsException("failed to insert new point into any \
of the four child nodes, big problem");
}
// if the node falls within the boundary but this node not a leaf
if (leaf_ == false)
{
return false;
}
// if the node falls within the boundary and will not cause an overflow
else
{
// insert new point
if (bucket_ == NULL)
{
bucket_ = new PointBucket(capacity_, minX_, minY_, maxX_, maxY_,
MCP_, instanceDirectory_, resolutionBase_,
numberOfResolutionLevels_);
}
bucket_->setPoint(newPoint);
numberOfPoints_++;
return true;
}
}
}
// Insert in PointBucket (quadtree holds pointers to PointBuckets which hold the points)
void PointBucket::setPoint(LidarPoint& newPoint)
{
//for each sub bucket
for (int k = 0; k < numberOfResolutionLevels_; ++k)
{
// check if the point falls into this subbucket (always falls into the big one)
if (((numberOfPoints_[0] + 1) % int(pow(resolutionBase_, k)) == 0))
{
if (!incache_[k])
cache(true, k);
// Update max/min intensity/Z values for the bucket.
if (newPoint.getIntensity() > maxIntensity_)
maxIntensity_ = newPoint.getIntensity();
else if (newPoint.getIntensity() < minIntensity_)
minIntensity_ = newPoint.getIntensity();
if (newPoint.getZ() > maxZ_)
maxZ_ = newPoint.getZ();
else if (newPoint.getZ() < minZ_)
minZ_ = newPoint.getZ();
points_[k][numberOfPoints_[k]] = newPoint;
numberOfPoints_[k]++;
}
}
}
Now my question is if you can think of a way to improve this design? What are some general strategies when dealing with huge amounts of data that doesn't fit into memory? How can I make the quadtree more efficient? Is there a way to speed up rendering of points?
Upvotes: 4
Views: 2910
Reputation: 162317
Now my question is if you can think of a way to improve this design?
Yes: Don't store the objects itself in the quadtree. Put them into a flat structure (array, linked list, etc.) and have the Quadtree just keep a pointer to the actual objects. If the quadtree has a certain depth (on all nodes), you could flatten it as well.
Upvotes: 3