Reputation: 3077
I understand the unordered_
stl containers keep a number of buckets that vary in number depending on the number of elements in the container. When inserting, if breaking a certain limit, the container will rehash to use more buckets, so each bucket is less full and faster to search. And that invalidates the iterators.
This means I shouldn't keep iterators to an unordered
container. Except I can, if I update them after a rehash. But I couldn't find a reliable way to check whether an insert
(be it emplace
or whatever) caused a rehash.
Should I monitor bucket_count()
?
cppreference says Rehashing occurs only if the new number of elements is greater than max_load_factor()*bucket_count()
. Is that garanteed? Would it be reliable to do the following?
bool will_rehash = (max_load_factor()*bucket_count()) > size()+1;
Upvotes: 6
Views: 2009
Reputation: 1293
I don't think that re-hashing (as the process where actually a hash function is engaged) takes place once hash-map is growing:
That means, that one may watch over bucket count to infer whether iterator should be invalidated (predicated the invalidation happens at the moment of re-bucketing)
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;
typedef unordered_map<string, string> str_map;
struct my_hash {
void reset(void) { calls_ = 0; }
size_t calls(void) { return calls_; }
size_t operator()(const string& key) const {
++my_hash::calls_;
return hash_fn_(key);
}
private:
str_map native_hash_fn_;
str_map::hasher hash_fn_{native_hash_fn_.hash_function()};
static size_t calls_;
};
size_t my_hash::calls_ = 0;
int main ()
{
typedef unordered_map<string, string, my_hash> myMapType;
myMapType mymap(1, my_hash());
myMapType::hasher hash = mymap.hash_function();
cout << "mymap has " << mymap.bucket_count() << " buckets" << endl;
mymap["abc1"] = "blah"; // add 3 values
mymap["abc2"] = "blah"; // just to see the hash calls tracking
mymap["abc3"] = "blah"; // is working
cout << "hash calls: " << hash.calls() << endl;
hash.reset();
for(size_t i=0; i < 10; ++i) {
mymap[to_string(i)] = "blah";
cout << "buckets: " << mymap.bucket_count() << endl;
cout << "hash calls: " << hash.calls() << endl;
hash.reset();
}
cout << "mymap has " << mymap.bucket_count() << " buckets" << endl;
}
Output:
mymap has 2 buckets
hash calls: 3
buckets: 5
hash calls: 1
buckets: 11
hash calls: 1
buckets: 11
hash calls: 1
buckets: 11
hash calls: 1
buckets: 11
hash calls: 1
buckets: 11
hash calls: 1
buckets: 11
hash calls: 1
buckets: 23
hash calls: 1
buckets: 23
hash calls: 1
buckets: 23
hash calls: 1
mymap has 23 buckets
Disclaimer: though, I strongly believe that it's not a good programming practice to refer iterators after the container has changed in size. Even if it might not cause some fatal / undefined behaviors, it might (and most likely will) cause some side effects onto the programming logic. In case with the hash-map, consider a situation with a begin()
iterator: after a few insertions/emplacements it won't be a true begin
anymore. Even if re-bucketing did not occur, some new entries might get installed in front of it (which might affect the programming logic).
Upvotes: 1