Reputation: 1238
I'm wondering if there's any guarantee that the order of an unordered_map will always be the same across all CPUs, threads, etc.
I realize there may be no obvious pattern to the particular order itself (hence, 'unordered' map), but if I run my process on another machine, or multiple times in succession, or on different threads, will the order of inserted items always be the same if the hash functions and insertion order remains the same? In other words, if my code doesn't change, will every execution of my process result in the elements of the map being in the same order?
I've run a few tests, and the order of items after insertion seems to be the same each time, but that could merely be a fluke, and I only have this single machine to test on. I need to know whether the order can be influenced by any other factors such as CPU/memory architecture, OS (Windows 8 vs Windows 10), etc.
Upvotes: 4
Views: 4295
Reputation: 21
I got the different output orders when running the following code in different hardware platforms.
#include <iostream>
#include <set>
#include <unordered_set>
#include <vector>
#include <functional>
#include <string>
struct hash_func {
std::size_t operator()(const std::string &key) const {
std::size_t hash = 0U;
const std::size_t mask = 0xF0000000;
for (std::string::size_type i = 0; i < key.length(); ++i) {
hash = (hash << 4U) + key[i];
std::size_t x = hash & mask;
if (x != 0)
hash ^= (x >> 24);
hash &= ~x;
}
std::cout << "for key " << key << " hash " << hash << std::endl;
return hash;
}
};
void show() {
std::vector<std::string> v;
v.push_back("n1YxyaBzoRogNh72eri7HBGijCAtcHpf9nm,");
v.push_back("n1GV9UScgncwU6KKL9T18mCo2S6uAE69SWs,");
v.push_back("n1X2SXyEKej7GZgAreXDCkiT59qaYKBDcYi,");
std::unordered_set<std::string, hash_func> s;
for (auto it = v.begin(); it != v.end(); it++) {
s.insert(*it);
}
for (auto it = s.begin(); it != s.end(); it++) {
std::cout << *it << std::endl;
}
}
int main() {
show();
return 0;
}
The results are:
1) Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
for key n1YxyaBzoRogNh72eri7HBGijCAtcHpf9nm, hash 43411516
for key n1GV9UScgncwU6KKL9T18mCo2S6uAE69SWs, hash 93983804
for key n1X2SXyEKej7GZgAreXDCkiT59qaYKBDcYi, hash 17984604
n1X2SXyEKej7GZgAreXDCkiT59qaYKBDcYi,
n1GV9UScgncwU6KKL9T18mCo2S6uAE69SWs,
n1YxyaBzoRogNh72eri7HBGijCAtcHpf9nm,
2) AMD EPYC 7501 32-Core Processor
for key n1YxyaBzoRogNh72eri7HBGijCAtcHpf9nm, hash 43411516
for key n1GV9UScgncwU6KKL9T18mCo2S6uAE69SWs, hash 93983804
for key n1X2SXyEKej7GZgAreXDCkiT59qaYKBDcYi, hash 17984604
n1X2SXyEKej7GZgAreXDCkiT59qaYKBDcYi,
n1YxyaBzoRogNh72eri7HBGijCAtcHpf9nm,
n1GV9UScgncwU6KKL9T18mCo2S6uAE69SWs,
Upvotes: 2
Reputation: 9113
TL;DR: It can be done, but I wouldn't recommend it. Use another data structure if you can; your own hash table, a "treap", a flat array, or something.
The order of items in an std::unordered_map
(or set) is more dependent on the standard library implementation than hardware/CPU/etc.
For that reason, if you use the same library implementation across different hardware, and provide your own hash function (to make sure it's not randomized across runs - to combat DoS attacks against your data structures,) or otherwise hardware- or OS-dependent, then you should be fine.
However, if you are looking for a guarantee in the standard, you won't find any. The only relevant guarantee is that in the same instance of the object, the same key will hash to the same bucket. I don't think there is a guarantee even for different instances of the map, and I know from (painful) personal experience that there is no consistency across different runs of the application.
But not all hope is lost! If you stick to the same implementation of unordered_map
, and use your own hash functions, and take a look at the implementation to make sure there are no hidden surprises (any dependency on hardware/OS/time/RNG/etc. should be relatively easy to spot) you can manage it.
Note that since you seem to be on Windows and probably using MSVC, the default unordered_map
with the default hashing algorithm is not at all order-consistent across runs of the same compiled binary (at least it wasn't in 2013/2015 IIRC.)
Another thing to keep in mind is that if you are serious about consistency, you have to make sure you link against the CRT statically. If you link against the DLL version, some future patch/update can change the behavior for your application after you release it.
Upvotes: 7