willem
willem

Reputation: 2717

Role of std::unordered_set::reserve for container memory requirement?

Suppose I use a std::unordered_set<MyClass>, and suppose sizeof(MyClass) is large, i.e. much larger than sizeof(size_t) and sizeof(void*). I will add a large number numberOfElementsToBeAdded of elements to the unordered set.

From https://stackoverflow.com/a/25438497/4237 , I find: "each memory allocation may be rounded up to a size convenient for the memory allocation library's management - e.g. the next power of two, which approaches 100% worst case inefficiency and 50% average, so let's add 50%, just for the list nodes as the buckets are likely powers of two given size_t and a pointer: 50% * size() * (sizeof(void*) + sizeof((M::value_type))"

This drives me to conclude that actual memory consumption will be between 1*numberOfElements*sizeof(MyClass) and (1+1)*numberOfElements*sizeof(MyClass), modulo some additional memory, which is of little interest here because it is of the order sizeof(size_t).

However, I know the number of elements to be added in advance, so I call:

std::unordered_set<MyClass> set;
set.reserve(numberOfElementsToBeAdded);
//Insert elements

Thinking about the parallel of calling std::vector::reserve, I would guess that this avoids the potential overhead, and would thus guarantee memory consumption to be approximately 1*numberOfElements*sizeof(MyClass) (modulo some additional memory, again of order sizeof(size_t)).

Could I rely on implementations of the standard library to guarantee that calling reserve like this will avoid the 0-100% (50% average) overhead mentioned in the above answer?

Upvotes: 2

Views: 1156

Answers (2)

Some programmer dude
Some programmer dude

Reputation: 409166

From this std::unordered_set::reserve reference

Sets the number of buckets to the number needed to accomodate at least count elements ...

There's some more, but the gist is that std::unordered_set::reserve doesn't actually work like std::vector::reserve. For an unordered set it allocates buckets for the hash table, not memory for the actual elements themselves.

The set could put one element per bucket, but it's not guaranteed.

Upvotes: 3

Davide Spataro
Davide Spataro

Reputation: 7482

Quoting the standard:

Sets the number of buckets to the number needed to accomodate at least count elements without exceeding maximum load factor and rehashes the container, i.e. puts the elements into appropriate buckets considering that total number of buckets has changed

The important part is **at least*. I do not think you can rely on the implementation and be sure that it only uses less amount of memory. My guess is that regardless if you call reserve the memory usage for numElements will be similar.

Upvotes: 0

Related Questions