Reputation: 319
I need to create a multi-tree in a critical part of my code. Standard way is to use something like:
struct node{ // or class
... data ...
std::unordered_set<node*> children;
};
node* root;
One important thing is that, once a node is created, it won't be deleted until the whole tree is deleted.
The obvious way is to use new and delete to manage this data on the heap. Done.
I am wondering whether it is worth exploring the performance of using a std::vector instead of new/delete directly:
struct node{ // or class
... data ...
std::unordered_set<uint32_t> children;
};
std::vector<node> tree;
uint32_t root; // probably always 0, first element of the vector
Deleting the tree would be simply done with tree.clear();
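For illustration, a minimal sketch of this idea (the addNode helper is hypothetical, just to show how indices replace pointers):
#include <cstdint>
#include <unordered_set>
#include <vector>

struct node {
    // ... data ...
    std::unordered_set<std::uint32_t> children; // indices into the tree vector
};

std::vector<node> tree;

// Hypothetical helper: append a node and register it as a child of its parent.
std::uint32_t addNode(std::uint32_t parent) {
    tree.push_back(node{});
    auto idx = static_cast<std::uint32_t>(tree.size() - 1);
    tree[parent].children.insert(idx);
    return idx;
}

int main() {
    tree.push_back(node{});       // root at index 0
    std::uint32_t a = addNode(0); // children refer to nodes by index
    addNode(a);
    tree.clear();                 // deleting the whole tree at once
    return 0;
}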
Is there any chance that it would be faster for big (or huge) trees?
Upvotes: 2
Views: 575
Reputation: 10962
I'm guessing you have a tree on the order of 10 million+ elements (or at least in the ballpark of 100k+) to be concerned with this.
It then all depends on how your data and tree are used. Spatial locality of data is usually the concern, but it's not obvious whether that applies to your case. If the tree is traversed a lot after being built, and only modified a little, then it makes sense to organize the data so that it is iterated contiguously (as a std::vector usually is). In that case you do not want pointers like that, or rather you do not want single-element new/delete, as that is almost guaranteed to bog you down (you cannot count on the elements being placed contiguously in memory) - but it all depends on your use case(s).
A simple and radically different approach would be:
std::vector< std::pair<PathInTree, Node> > tree;
Then keep it sorted such that the Nodes appear in the order in which you iterate the tree. This is a somewhat extreme solution and has drawbacks, including that insertion can become expensive if there is a lot of random insertion going on. I also left out how to implement PathInTree, but it should usually be some kind of sequence (like in a filesystem structure).
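A minimal sketch of that layout, assuming PathInTree is the sequence of child positions from the root, compared lexicographically (the find helper and payload field are made up for illustration):
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Assumed: a path is the sequence of child positions from the root;
// lexicographic ordering then yields a pre-order-like layout.
using PathInTree = std::vector<std::uint32_t>;

struct Node {
    int payload = 0; // placeholder for the real data
};

using Tree = std::vector<std::pair<PathInTree, Node>>;

// Binary search for a node by its path in the sorted vector.
Node* find(Tree& tree, const PathInTree& path) {
    auto it = std::lower_bound(tree.begin(), tree.end(), path,
        [](const auto& entry, const PathInTree& p) { return entry.first < p; });
    return (it != tree.end() && it->first == path) ? &it->second : nullptr;
}

int main() {
    Tree tree = {
        {{}, {1}},     // root
        {{0}, {2}},    // first child of root
        {{0, 0}, {3}}, // first child of that child
        {{1}, {4}},    // second child of root
    };
    std::sort(tree.begin(), tree.end(),
              [](const auto& a, const auto& b) { return a.first < b.first; });
    return find(tree, {0, 0}) != nullptr ? 0 : 1;
}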
Upvotes: 2
Reputation: 6298
Considering that each instance of std::unordered_set<uint32_t> is still going to result in at least one more heap allocation (internally), the achievable savings have a hard bound: you are going to save less than 50% of the necessary heap allocations.
One important thing is that, once a node is created, it won't be deleted until the whole tree is deleted.
Re-allocations are still happening in an std::vector as it grows. In contrast to immovable allocations on the heap, this additionally involves moving the already allocated nodes. Depending on what other attributes your node elements have, that may be even costlier. You will also need to make sure that your node class is compatible with move semantics.
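A quick compile-time sanity check of that last point could look like this (the stronger nothrow guarantee is only shown as a comment, since whether it holds depends on the standard library's std::unordered_set):
#include <cstdint>
#include <type_traits>
#include <unordered_set>

struct node {
    // ... data ...
    std::unordered_set<std::uint32_t> children;
};

// std::vector only moves elements during re-allocation when the move
// constructor cannot throw; otherwise it falls back to copying each node.
static_assert(std::is_move_constructible_v<node>, "node must be movable");
// static_assert(std::is_nothrow_move_constructible_v<node>, "may fail on some implementations");

int main() { return 0; }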
With std::unordered_set also allocating on the heap, you won't be able to avoid the resulting memory fragmentation either.
All in all, unless you expect to be limited by the performance of heap allocations, or you have a use case for order-independent tree traversal, compacting node instances is likely not going to be worth the effort.
A potentially reasonable optimization would be to swap std::unordered_set<node*> for std::unordered_set<node>. Unless you need that indirection in that place (external pointers?), that saves you just as much as your own attempt, without sacrificing debug capabilities.
Implementing the necessary hash function and equality operator for node is usually easy.
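A minimal sketch of such a hash function and equality operator, assuming a hypothetical id field serves as a node's identity (the children member is left out to keep the focus on hashing and equality):
#include <cstdint>
#include <functional>
#include <unordered_set>

struct node {
    std::uint32_t id{}; // hypothetical field used as the node's identity
    // ... data ...
    bool operator==(const node& other) const { return id == other.id; }
};

namespace std {
template <>
struct hash<node> {
    size_t operator()(const node& n) const noexcept {
        return hash<std::uint32_t>{}(n.id);
    }
};
}

int main() {
    std::unordered_set<node> children;
    children.insert(node{1});
    children.insert(node{2});
    children.insert(node{1}); // same id, not inserted again
    return children.size() == 2 ? 0 : 1;
}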
If performance is a real concern, and you also have a (reasonable) upper bound on the number of children per node, you may consider replacing std::unordered_set<uint32_t> children with std::array<uint32_t, n> children, using std::find() for detecting collisions.
The reason for that is quite simple: it ensures that node becomes TriviallyCopyable, which reduces re-allocations to a plain memcpy internally. For up to 16 children or so, this is likely going to be neutral in terms of memory consumption, but still faster.
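A sketch of that variant, assuming a hypothetical bound of 16 children and an add_child helper made up for illustration:
#include <algorithm>
#include <array>
#include <cstdint>
#include <type_traits>
#include <vector>

constexpr std::size_t kMaxChildren = 16; // hypothetical upper bound

struct node {
    // ... data ...
    std::array<std::uint32_t, kMaxChildren> children{}; // indices into the tree vector
    std::uint32_t child_count = 0;

    // Insert a child index unless it is already present; std::find detects collisions.
    bool add_child(std::uint32_t idx) {
        auto last = children.begin() + child_count;
        if (std::find(children.begin(), last, idx) != last)
            return false;               // already a child
        if (child_count == kMaxChildren)
            return false;               // bound exceeded
        children[child_count++] = idx;
        return true;
    }
};

static_assert(std::is_trivially_copyable_v<node>,
              "re-allocation of the tree vector can now memcpy nodes");

int main() {
    std::vector<node> tree(1);  // root at index 0
    tree.push_back(node{});     // new node at index 1
    tree[0].add_child(1);
    return 0;
}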
Upvotes: 1