Reputation: 2546
Suppose sizeof(MyClass) is N bytes, and I put M objects of MyClass into a std::set<MyClass>. The theoretical memory cost would be N*M bytes, but as far as I know std::set is implemented with a tree structure, so I reckon there must be extra memory overhead. How can I estimate an approximate, reasonable memory cost for a set? When I run my program, the memory cost is higher than what I expected.
Upvotes: 2
Views: 416
Reputation: 153810
The ordered associative containers have constraints which pretty much require that they are implemented as a balanced tree. Most likely the nodes in each tree will have two pointers to the children and a pointer to the parent node. Depending on whether the implementation plays bit tricks to store additional information in the pointers, there may be an extra word in the allocated nodes. Further, the memory management system may store a word for each allocated chunk of memory and align it, e.g., to a 16 byte boundary.
What the implementation actually does is certainly not specified in the standard, but I'd think the minimum memory consumption for M elements of size N is

M * (N + 2 * sizeof(void*) + sizeof(int))

It is probably somewhat more: I would use a pointer to the parent, too.
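As a rough sanity check, here is a minimal sketch of that estimate. The MyClass size and the element count M are made-up values, and the per-node overhead is just the assumed minimum from the formula above, not something any particular implementation guarantees.

```cpp
#include <cstddef>
#include <iostream>

struct MyClass { char data[24]; };   // hypothetical payload, N = 24 bytes

int main() {
    const std::size_t M = 100000;    // hypothetical element count
    // Assumed minimum per node: the element plus two child pointers and an int.
    const std::size_t per_node =
        sizeof(MyClass) + 2 * sizeof(void*) + sizeof(int);
    std::cout << "estimated minimum: " << M * per_node << " bytes\n";
}
```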
Upvotes: 1
Reputation: 299760
As mentioned by @SirDarius, this is implementation-dependent and therefore you would need to check on a per-implementation basis.
Usually, std::set is implemented as a red-black tree with lean iterators. This means that for each element in the set there is one node, and this node roughly contains:

- 3 pointers: parent, left child, right child
- 1 red/black flag
- the element itself (MyClass)

The minimal footprint of such a node on a 64-bit platform, footprint(node), is therefore:

- 3 * 8 bytes for the pointers
- sizeof(MyClass), rounded up to a multiple of 8 bytes (for alignment reasons: look up "struct padding")

Note: it is easy to use an unused bit somewhere to stash the red/black flag at no additional cost, for example taking advantage of the fact that, given node alignments, the least-significant bits of the pointers are always 0, or putting it in MyClass's padding, if there is any.
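To make that layout concrete, here is a hypothetical node struct resembling what a red-black tree based set might use internally. Real implementations differ in member order and in where they keep the colour flag; this is only an illustration.

```cpp
#include <cstdint>
#include <iostream>

struct MyClass { std::int32_t value; };  // small example payload

// Hypothetical red-black tree node; actual layouts vary per implementation.
struct Node {
    Node*   parent;
    Node*   left;
    Node*   right;
    bool    is_red;    // often stashed in spare pointer bits instead
    MyClass element;   // padded up to the node's alignment
};

int main() {
    // On a typical 64-bit platform this prints 32: 24 bytes of pointers,
    // with the flag and the 4-byte payload packed into the last 8 bytes.
    std::cout << sizeof(Node) << " bytes per node\n";
}
```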
However, when you allocate an object T via new (which is what std::allocator does under the hood, and which itself uses malloc unless overloaded), you do not allocate just sizeof(T) bytes: malloc uses size buckets, meaning that if you have 8-byte buckets and 16-byte buckets and you allocate a 9-byte object, it goes into a 16-byte bucket (wasting 7 bytes).

Therefore, indeed, the use of a set will most likely consume more than just set.size() * sizeof(MyClass), though the overhead depends on your Standard Library implementation. This overhead will likely be large for small objects (such as int32_t: on a 64-bit platform you are looking at a +500% overhead at least) because the cost of linking the nodes together is usually fixed, whereas for large objects it might not be noticeable.
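If you would rather measure than estimate, one option is to plug a counting allocator into the set and see how many bytes it requests per node on your own Standard Library. This is a minimal sketch (CountingAllocator and g_allocated are names made up for the example), and it only captures what the container asks for, not malloc's own book-keeping on top of that.

```cpp
#include <cstddef>
#include <iostream>
#include <set>

static std::size_t g_allocated = 0;  // total bytes requested by the container

// Minimal allocator that tallies every allocation made through it.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}
    T* allocate(std::size_t n) {
        g_allocated += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

int main() {
    std::set<int, std::less<int>, CountingAllocator<int>> s;
    for (int i = 0; i < 1000; ++i) s.insert(i);
    std::cout << g_allocated << " bytes requested for "
              << s.size() * sizeof(int) << " bytes of payload\n";
}
```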
In order to reduce the overhead, you traditionally use arrays. This is what a B-tree does, for example, though it does so at the cost of stability (i.e., each element staying at a fixed memory address regardless of what happens to the others).
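As an illustration of the array-based idea, here is a minimal flat-set sketch built on a sorted std::vector rather than a full B-tree (this is essentially what boost::container::flat_set does). The per-element overhead shrinks to the vector's spare capacity, but insertion becomes O(N) and element addresses are not stable.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Keep elements sorted in contiguous storage instead of one node per element.
template <typename T>
class FlatSet {
    std::vector<T> data_;  // invariant: always sorted, no duplicates
public:
    bool insert(const T& value) {
        auto it = std::lower_bound(data_.begin(), data_.end(), value);
        if (it != data_.end() && *it == value) return false;  // already present
        data_.insert(it, value);  // shifts the tail: O(N), invalidates addresses
        return true;
    }
    bool contains(const T& value) const {
        return std::binary_search(data_.begin(), data_.end(), value);
    }
    std::size_t size() const { return data_.size(); }
};
```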
Upvotes: 4