user2485710

Reputation: 9811

Why store a tree as a contiguous chunk of memory?

I just discovered that some tree-based data structures, when high performance is the goal, are oftentimes stored as a contiguous chunk of memory; this is especially popular with the so-called "policy-based data structures".

The problem is that I can't wrap my head around why one would want to do that; when you try to "linearize" a tree to store it as a vector/array, how do you make sure that you are re-arranging the branches and the leaves in a meaningful way that helps performance? Is this OK only for perfectly balanced trees?

In other words, I can't imagine the access pattern for a linear data structure that spans multiple levels and has multiple leaves; usually a tree adds one level of indirection for each node/leaf, and this simplifies things a lot for the user, but how should one organize such a "linear" tree?

Upvotes: 8

Views: 8428

Answers (6)

ForguesR

Reputation: 3618

It is important to differentiate between a plain tree and an AVL tree. In your question you talk about balancing the tree, so you are probably asking about representing an AVL tree in an array.

All the other answers talk about a plain tree, not an AVL tree. As far as I know, this kind of tree can be represented in an array but cannot be updated efficiently, since you have to reorder many elements of the array instead of just re-linking pointers.

That means you can represent a perfectly ordered, balanced tree in an array as long as the input elements are already sorted. This tree will be faster than the usual in-memory tree, but updating it will be "harder".

My conclusion would be:

  • read-only tree with a lot of accesses => in an array
  • updatable tree with a lot of changes => in memory

Upvotes: 1

Mooing Duck

Reputation: 66961

There are actually many such patterns, which have two purposes: saving memory, and keeping nodes together, primarily for paging performance.

One very simple version is to allocate blocks of three: a parent and its two children, where each block points to four "child" blocks, two under each child. This cuts your allocation count to a third. That isn't much of an optimization until you scale it up to blocks of 7, 15, 31, 63... keys: if you can get as many keys as possible to fit onto a single memory-system page, you minimize the time spent waiting for the hard drive. If your keys are 32 bytes each and a page is 4K, then 7 full rows of the tree (127 keys) fit on one page, which means you only have to load one page from the hard drive for every 7 rows of the tree. At that point you load the "child" page and follow another 7 rows. A regular binary tree may have only one node per page, which means you spend 7x as much time simply waiting for the hard drive as you iterate the tree. Quite slow. Rotations are a little tricky, because you have to actually swap the data, not the pointers as is common in tree implementations. Also, it can waste a LOT of space when the tree becomes larger.

               ---------
               |   O   |
               |  / \  |
               | O   O |
              _---------_
           __/    \ /    \__
          /       | |       \
--------- --------- --------- ---------
|   O   | |   O   | |   O   | |   O   |
|  / \  | |  / \  | |  / \  | |  / \  |
| O   O | | O   O | | O   O | | O   O |
--------- --------- --------- ---------
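For concreteness, here is a minimal C++ sketch of this blocked layout (the names Block3 and contains, and the exact arrangement of the four child pointers, are made up for illustration; they are not from any particular library):

    #include <cstddef>

    // One heap block holds a parent key and its two children, so a single
    // allocation (and, ideally, a single page) covers three comparisons
    // before the next block has to be fetched. The sketch assumes a block
    // fills its three key slots before it grows any child blocks.
    using Key = int;

    struct Block3 {
        Key  parent, left, right;      // invariant: left < parent < right
        bool hasLeft = false, hasRight = false;
        // Subtrees, left to right: < left, (left..parent), (parent..right), > right
        Block3* child[4] = {};
    };

    // Search descends one block at a time; each loop iteration touches one block.
    bool contains(const Block3* b, Key k) {
        while (b) {
            if (k == b->parent) return true;
            if (k < b->parent) {
                if (!b->hasLeft)  return false;
                if (k == b->left) return true;
                b = (k < b->left) ? b->child[0] : b->child[1];
            } else {
                if (!b->hasRight)  return false;
                if (k == b->right) return true;
                b = (k < b->right) ? b->child[2] : b->child[3];
            }
        }
        return false;
    }

Insertion and rotation are where this layout gets awkward, as noted above, because you move key values between slots rather than re-pointing nodes.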

Another, far more complex "pattern" is to cut the tree in half vertically, so the top half is one "subtree" with many child "subtrees", and to store each "subtree" linearly, repeating this recursively. This is a very strange pattern, but it ends up working vaguely like the one above, except that it's "cache-oblivious": it works with any page size or cache hierarchy. Quite cool, but such layouts are complex, and virtually everything runs on one of three well-known architectures, so they aren't popular. They're also extremely difficult to insert into and remove from.

Another very simple variant is to put the whole tree into an array accessed via indices, which saves total memory, but only the top of the tree is cache friendly; the lower levels are worse, cache-wise, than a regular binary tree. Effectively, the root is at index 0, and for the node at index n the left child is at n*2+1 and the right child at n*2+2. For the node at index 24, its parent is at (n-1)/2 = 11, and its left and right children are at 49 and 50 respectively. This works great for small trees, because it requires no overhead for pointers whatsoever: the data is stored as a contiguous array of values, and the relationships are inferred from the indices. Also, adding and removing children always happens near the right end of the array, and normal binary tree insertion/rotation/erasure applies.

These trees also have an interesting mathematical novelty: if you convert the index plus one to binary, it corresponds to the location in the tree. For the node at index 24, 24+1 in binary is 11001. The first 1 always means the root, and from there each 1 means "go right" and each 0 means "go left", so to get to index 24 from the root you go right, left, left, right, and you're there. Additionally, since there are 5 binary digits, you know it's in the fifth row. Neither of these observations is particularly useful, other than that they sort of imply the root node is a right child, which is vaguely interesting. (If you expand to other bases, the root is always the rightmost child.) That being said, it's still often useful to implement the root as a left node if you work with bidirectional iterators.

     0
    / \
   /   \
  1     2
 / \   / \
3   4 5   6

[0][1][2][3][4][5][6]
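A minimal sketch of that index arithmetic (the helper names and the use of std::vector are just for illustration; handling of empty slots is omitted):

    #include <cstddef>
    #include <vector>

    // Implicit layout: node i's children live at 2*i+1 and 2*i+2,
    // and its parent at (i-1)/2. No pointers are stored at all.
    std::size_t leftChild (std::size_t i) { return 2 * i + 1; }
    std::size_t rightChild(std::size_t i) { return 2 * i + 2; }
    std::size_t parentOf  (std::size_t i) { return (i - 1) / 2; }

    // Assumes 'tree' holds a binary search tree stored level by level in
    // this layout; a real implementation would mark empty slots somehow.
    bool contains(const std::vector<int>& tree, int key) {
        std::size_t i = 0;
        while (i < tree.size()) {
            if (tree[i] == key) return true;
            i = (key < tree[i]) ? leftChild(i) : rightChild(i);
        }
        return false;
    }

A lookup here never dereferences a stored pointer; it just recomputes indices, which is why the top few levels, which stay in cache, are so cheap to traverse.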

Upvotes: 7

2785528

Reputation: 5576

" ... try to "linearize" a tree to store it as a vector/array, how you make sure that you are re-arranging the branches and the leafs in a meaningful way that helps performance ..."

I believe you are thinking too hard.

In a normal tree, you use 'new' to request free space in which to create a node.

You use delete to return the no longer needed space to the heap.

Then connect the nodes using pointers.


For a 'tree-in-a-vector', you might simply re-implement your new and delete to find space in a vector.

Instead of pointers (to the parent, or the left or right node), I think you use indices (to the parent, or the left or right node).

I believe the index of the nth item in a vector is unchanged both before and after the reallocation that happens when the vector grows (unlike a pointer, which would be invalidated).

Another challenge is deleting a node ... but this can be as simple as decrementing by 1 every stored index that is greater than the index of the erased node.
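A rough sketch of what such an index-linked node might look like (the names are hypothetical, and this is deliberately incomplete):

    #include <cstdint>
    #include <vector>

    // Links are indices into the vector rather than pointers, so they stay
    // valid even when the vector reallocates as it grows.
    struct Node {
        int          value;
        std::int32_t parent = -1;   // -1 plays the role of nullptr
        std::int32_t left   = -1;
        std::int32_t right  = -1;
    };

    struct Tree {
        std::vector<Node> nodes;

        // The replacement for 'new': append a node and hand back its index.
        std::int32_t allocate(int value) {
            nodes.push_back(Node{value});
            return static_cast<std::int32_t>(nodes.size()) - 1;
        }
    };

Erasure would then either keep a free list of reusable slots, or compact the vector and fix up every stored index greater than the erased one, as mentioned above.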


These choices can be a fair trade for a tree that changes seldom, but must be captured as quickly as possible.

Does storing your tree really require the performance of a vector block save? Is the vector block save actually faster than a depth-first save of the same tree? You really need to measure.

Upvotes: 1

Ryan J

Reputation: 8323

You might find the short article here interesting.

Basically, the reasoning behind using a contiguous block of memory for such a structure is that it greatly improves lookup and scan times when dealing with potentially large sets of data. If your memory is non-contiguous, a traversal has to chase pointers to scattered locations, and retrieving data from the structure becomes costly.

Hopefully this addresses your interests.

Here are the two images from the article that illustrate this concept:

[Image: a balanced tree]

[Image: the same tree as stored in contiguous memory]

Upvotes: 8

Jeremy Friesner

Reputation: 73279

how do you make sure that you are re-arranging the branches and the leaves in a meaningful way that helps performance?

If you have the program running already (with a non-contiguous tree), you could always just instrument your program to report what its actual node-access patterns typically look like. Once you have a good idea of how the nodes are accessed, you could then customize your node-allocator to allocate the nodes in memory in that same order.

Upvotes: 2

Thomas Matthews

Reputation: 57749

Storing data structures in contiguous memory is a technique used on memory-constrained systems, such as embedded systems. The technique may also be used on safety- and performance-critical systems.

Desktop systems usually have a lot of memory, and their applications are short-lived. Their dynamic memory allocation process is to find the next available block in the memory pool and return it. If no block is available (due to fragmentation, for example), the allocation fails. There is no control over how much memory can be consumed.

With a contiguous allocation method, the number of nodes created can be restricted or limited. This means that on a system with 32 KB of memory, the tree won't use up all the memory and leave holes.

The allocation process is faster in a contiguous system: you know where the blocks are. Also, instead of storing pointers for the links, index values can be stored. This also allows the tree to be stored in a file and retrieved easily.

You can model this by creating an array or vector of nodes. Change your node data structure to use array indices instead of pointers.
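A minimal sketch under those constraints (MAX_NODES, the field names, and the 16-bit sizes are assumptions made for the example, not a prescribed layout):

    #include <array>
    #include <cstddef>
    #include <cstdint>

    // Fixed pool of nodes: the tree can never consume more than
    // sizeof(TreePool) bytes, and because the links are indices the whole
    // pool can be written to / read back from a file as one contiguous block.
    constexpr std::size_t MAX_NODES = 1024;

    struct Node {
        std::int16_t key;
        std::int16_t left  = -1;   // indices into the pool; -1 == no child
        std::int16_t right = -1;
    };

    struct TreePool {
        std::array<Node, MAX_NODES> nodes{};
        std::size_t used = 0;

        // Allocation is a bump of 'used'; it fails once the budget is spent.
        std::int16_t allocate(std::int16_t key) {
            if (used >= MAX_NODES) return -1;
            nodes[used] = Node{key};
            return static_cast<std::int16_t>(used++);
        }
    };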

Remember, the only way to know about performance issues is to profile.

Upvotes: 4
