SCaffrey
SCaffrey

Reputation: 341

Under what circumstances will std::unordered_map behave very slow?

I've done some random tests but I couldn't come to a conclusion.

If inserting 1000000 integers in to a map and an unordered_map, the time used by map is 3 times larger.

If inserting 1000000 strings then the time used by map is 2 times larger.

Under what circumstances will std::unordered_map behave very slow?

Thanks in advance.

UPD:: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3). All tests got without -O2.

Code:

a.cpp: std::map<int, int> M; b.cpp: std::unordered_map<int, int> M;

g(i, 1, 1000000) {
    M[i] = rand() % i;
}

Results of my tests:

yyhs@yyhs-Pro:~/Documents$ g++ a.cpp -o a -g --std=c++11 && time ./a

real    0m0.659s
user    0m0.653s
sys 0m0.004s
yyhs@yyhs-Pro:~/Documents$ g++ b.cpp -o b -g --std=c++11 && time ./b

real    0m0.260s
user    0m0.251s
sys 0m0.008s

yyhs@yyhs-Pro:~/Documents$ g++ a.cpp -o a -g --std=c++11 -O2 && time ./a

real    0m0.290s
user    0m0.282s
sys 0m0.008s
yyhs@yyhs-Pro:~/Documents$ g++ b.cpp -o b -g --std=c++11 -O2 && time ./b

real    0m0.081s
user    0m0.081s
sys 0m0.000s

My question here is what cases might cause std::unordered_map become slow.

Upvotes: 1

Views: 548

Answers (1)

Dietrich Epp
Dietrich Epp

Reputation: 213268

As usual, this will depend on the particular implementation, but this is not entirely true and the standard guarantees that std::unordered_map will outperform std::map asymptotically. It is only the constant factors that will vary from implementation to implementation. The insertion time for std::map is O(log N) and the average insertion time for std::unordered_map is O(1). See §23.4.4.1 and §23.5.4 in n3690 for details.

In general, std::unordered_map will outperform std::map by a large margin (as you observed) unless you have a lot of collisions. You can create collisions by choosing keys that get placed in the same bucket. This requires knowledge of your hash function and the mapping from hash values to buckets, but this knowledge can be used by attackers to make your program slow if they can control the keys in your hash table. For this reason, it is common to use randomized hash functions in exposed applications.

In pathological cases, std::map can outperform std::unordered_map if your hash function is chosen poorly (either very slow to evaluate or producing many collisions). This is extremely atypical.

As a minor note, the standard library std::unordered_map tends to be an open hash table in order to meet the C++ standard's requirements with respect to iterator behavior. This is known to be far from optimal for many applications, and there are a number of alternative hash table libraries out there which perform even better.

Upvotes: 3

Related Questions