Flo Ryan
Flo Ryan

Reputation: 443

std::unordered_map<> bucket_count() after default rehash

If i keep adding values to an unordered_map then every time the number of elements exceeds the bucket_count() (assuming max_load_factor = 1) rehashing takes place.

What I am very confused about is the bucket size after rehashing.

#include <iostream>
#include <unordered_map>
int main() {
   std::unordered_map<size_t, size_t> mp;

   for (size_t i = 0; i < 1000; ++i) {
      mp[i] = i;
      std::cout << " count: " << mp.bucket_count() << std::endl;
   }
}

This outputs 3 7 17 37 79 167 337 709 1493

I have noticed that the bucket size is prime and approximately doubles. However it is not the closest prime to the next power of 2 either.

What is the methodology behind this bucket size increase. I was surprised or stupid enough that I couldn't find anything about it in standard references such as cplusplus.com

Upvotes: 2

Views: 288

Answers (1)

Madows
Madows

Reputation: 11

The bucket sizes after rehashing will be compiler specific. The specific numbers you are seeing can be found by taking the current bucket size, multiplying by 2, and then taking the nearest prime larger than that value from the __prime_list array here: https://github.com/gcc-mirror/gcc/blob/5bea0e90e58d971cf3e67f784a116d81a20b927a/libstdc%2B%2B-v3/src/shared/hashtable-aux.cc

Upvotes: 1

Related Questions