Reputation: 3451
I have learned how a hashmap is implemented: an array of linked lists, if separate chaining is used. I know that when a key-value pair (key, val) is inserted, the hashmap inserts the pair {key, val} into the linked list at entry hash(key) % array.size of the underlying array.
So, for the insertion process, all that is done is a hash computation followed by a linked-list insertion. Isn't this insertion process O(1), assuming the hash is O(1)? Everything is then O(1) except possibly the linked-list insertion, and for that, couldn't we always choose to insert the pair at the beginning of the linked list, which is O(1)? Isn't this always O(1)? If so, why don't hash tables use this strategy?
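A minimal sketch of the strategy I mean, with made-up names (NaiveHashMap, Node) that are purely illustrative and with cleanup omitted:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Sketch of an "always prepend" insert: hash, pick the bucket, prepend.
template <typename K, typename V>
struct NaiveHashMap {
    struct Node {
        K key;
        V val;
        Node* next;
    };
    std::vector<Node*> buckets = std::vector<Node*>(16, nullptr);

    void insert(const K& key, const V& val) {
        std::size_t i = std::hash<K>{}(key) % buckets.size();
        // O(1): prepend to the bucket's list without reading the rest of it.
        // Nothing ever checks whether the key is already present.
        buckets[i] = new Node{key, val, buckets[i]};
    }
};

int main() {
    NaiveHashMap<std::string, int> m;
    m.insert("a", 1);
    m.insert("a", 2); // silently accepted: the map now holds "a" twice
}
```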
Upvotes: 0
Views: 104
Reputation: 6283
For a map where the key is required to be unique (e.g. std::unordered_map in C++), traversal of the bucket's list is still O(1) in the average case, but the key needs to be searched for in the bucket first to rule out a duplicate. That lookup cannot be skipped, and it is not constant time in the worst case: it may have to scan the entire bucket.
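As a rough illustration, reusing the hypothetical Node/buckets layout from the sketch in the question, a uniqueness-enforcing insert has to walk the bucket before it may prepend:

```cpp
// Sketch only: a member function in the style of the NaiveHashMap above.
// Enforcing unique keys means scanning the bucket first: O(1) on average
// with a good hash, but O(n) worst case (everything in one bucket).
void insert_unique(const K& key, const V& val) {
    std::size_t i = std::hash<K>{}(key) % buckets.size();
    for (Node* p = buckets[i]; p != nullptr; p = p->next) {
        if (p->key == key) { // duplicate found: overwrite (or reject)
            p->val = val;
            return;
        }
    }
    buckets[i] = new Node{key, val, buckets[i]}; // not found: O(1) prepend
}
```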
Even for a map where the key is not required to be unique (e.g. std::unordered_multimap in C++), insertion could in principle be done strictly in O(1) by always prepending to the bucket, but there is a requirement that clashes with that: support for "equal ranges", which requires all elements with equivalent keys to be grouped together within the bucket. So the need for an efficient lookup by key renders the fastest possible insertion method sub-par for multi-maps too.
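For illustration, this is the guarantee in question: equivalent keys come out adjacent, so insert cannot blindly prepend and must first locate any existing equal keys.

```cpp
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    std::unordered_multimap<std::string, int> m;
    m.insert({"a", 1});
    m.insert({"b", 2});
    m.insert({"a", 3}); // must end up grouped with the first "a"

    // equal_range relies on equivalent keys being stored adjacently.
    auto range = m.equal_range("a");
    for (auto it = range.first; it != range.second; ++it)
        std::cout << it->first << " -> " << it->second << '\n';
}
```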
Realistically, the only container that can skip all these checks is one where the key is guaranteed to be unique by design, so that no such check has to be performed at runtime. Only then do you get O(1) inserts without any pitfalls.
The remaining issue is that you expect not only average O(1) inserts, but also average O(1) reads. For that, many hashmaps support dynamic resizing of the hash table, which involves re-hashing all already inserted elements in order to get bucket sizes back down to something reasonable. That resizing is what gives most implementations their true worst-case O(size()) complexity on a single insert, even though inserts remain O(1) amortized.
Only hash-maps with a static table size can avoid this.
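As a sketch, again in the style of the hypothetical chained map from the question, this is the step that makes the one insert that triggers a resize cost O(size()):

```cpp
// Sketch: doubling the table re-hashes every stored node. The insert that
// triggers this pays O(size()); amortized over all inserts it is still O(1).
void rehash_double() {
    std::vector<Node*> old = std::move(buckets);
    buckets.assign(old.size() * 2, nullptr);
    for (Node* head : old) {
        while (head != nullptr) {
            Node* next = head->next;
            std::size_t i = std::hash<K>{}(head->key) % buckets.size();
            head->next = buckets[i]; // relink the node into its new bucket
            buckets[i] = head;
            head = next;
        }
    }
}
```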
Upvotes: 1
Reputation: 17372
Typically you don't want duplicate keys in any sort of map, i.e. when you try to insert a pair (key, value) you need to check whether that key already exists in the map or not. For this check you have to:
- compute hash(key) to find the bucket
- check whether key already exists in that bucket
And for checking whether the key already exists, you have to iterate the bucket, which can have n elements. Thus, with buckets implemented as linked lists, the worst case for insert is still O(n).
There are other implementations that build the buckets as balanced trees (Java's HashMap, for example, converts long chains into trees). With that implementation you can lower the worst case for insert to O(log n).
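A minimal sketch of that idea (not how std::unordered_map actually works): keep a balanced tree, here std::map, in each slot, so even a fully colliding table degrades to O(log n) rather than O(n). Note this requires keys to be orderable, not just hashable.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Sketch: buckets as balanced trees instead of linked lists, so the
// duplicate-key check costs O(log bucket_size) instead of O(bucket_size).
template <typename K, typename V>
struct TreeBucketMap {
    std::vector<std::map<K, V>> buckets = std::vector<std::map<K, V>>(16);

    void insert(const K& key, const V& val) {
        std::size_t i = std::hash<K>{}(key) % buckets.size();
        buckets[i][key] = val; // duplicate check and insert in O(log n)
    }
};

int main() {
    TreeBucketMap<std::string, int> m;
    m.insert("a", 1);
    m.insert("a", 2); // existing "a" found in O(log n) and overwritten
}
```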
Upvotes: 2