Reputation: 167
I have created a map of unit64_t to uint64_t. This is the code I have written for assesing the space complexity:
#include <bits/stdc++.h>
#include "sparsehash/internal/sparseconfig.h"
#include "sparsehash/sparse_hash_map"
using namespace std;
int main(int argc, char *argv[]){
std::string input,reference;
while (getline(cin,input)) {
reference += input;
input.clear();
}
cout<<"length of reference = "<<reference.length()<<endl;
unordered_map<uint64_t, uint64_t> m;
//google::sparse_hash_map<uint64_t, pair<short,long>> m;
for (auto it = reference.begin(); it != reference.end(); it++) {
m[it-reference.begin()]= it-reference.begin();
}
return 0;
}
When I run this with /usr/bin/time , this is the output produced by the program:
length of reference = 4641652
Command being timed: "./a.out"
User time (seconds): 2.97
System time (seconds): 0.15
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.13
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 251816
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 68259
Voluntary context switches: 1
Involuntary context switches: 104
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
The unordered map seems to take up 250MB of space. This seems to be unusually high. Why would this happen. The same code with google sparse hash takes only 89MB of space, which is more reasonable.
I don't understand why the C++ unordered map take up so much space?
Upvotes: 2
Views: 2040
Reputation: 39
Sum of Two Values
This problem from CSES problem set, Gives AC with using std::map<int, int>
and gives TLE with std::unordered_map<int, int>
. so definitely hashmap takes larger memory
Upvotes: 0
Reputation: 3565
You have 4641652
entries. So the total raw data size is 4641652*2*8 byte ~= 74 MB
.
There's a important fact about hash-tables. Fast hash tables have lots of hash buckets, and hash-tables with a small number of hash buckets are slow.
It basically all comes down to hash-collisions. If you have many hash buckets (and you have a good hashing function), than hash-collisions happen rarely. Therefore the lookup is really really fast. On the otherside if your table is small (not many hash buckets), than hash-collisions happen regularly. Thus the lookup function is a lot slower.
Now std::unordered_map
is manly desiged as a fast hash-table, so it has quite a lot overhead. There are much more hash buckets than entries. In this case the overhead is about 250 / 74 ~= 3.3x
, which seems to be quite normal.
But sparsehash
is designed to have as little overhead as possible (about ~2 bit per entry). But of course this means that it is quite a lot slower.
If your using a hash-map, you should always think about, if you want speed or you want to be memory efficient.
Upvotes: 5