Reputation: 756
The instructor in this video explains that hash map implementations usually contain a linked list to chain values in case of collisions. My question is: Why not use something like an AVL tree (that takes O(log n) for insertions, deletions and lookups), instead of a linked list (that has a worst case lookup of O(n))?
I understand that hash functions should be designed such that collisions would be rare. But why not implement AVL trees anyway to optimize those rare cases?
Upvotes: 1
Views: 1981
Reputation: 465
It depends of the language implementing HashMap. I dont think this is a strict rule.
For example in Java: What your video says is true up to Java 7. In Java 8, the implementation of HashMap was changed to make use of red-black trees once the bucket grows beyond a certain point.
If your number of elements in the bucket is less than 8, it uses a singly linked list. Once it grows larger than 8 it becomes a tree. And reverts back to a singly linked list once it shrinks back to 6.
Why not just use a tree all the time? I guess this is a tradeoff between memory footprint vs lookup complexity within the bucket. Keep in mind that most hash functions will yield very few collisions, so maintaining a tree for buckets that have a size of 3 or 4 would be much more expensive for no good reason.
For reference, this is the Java 8 impl of an HashMap (and it actually has a quite good explanation about how the whole thing works, and why they chose 8 and 6, as "TREEIFY" and "UNTREEIFY" threshold) : http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/HashMap.java?av=f
And in Java 7: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/java/util/HashMap.java?av=f
Upvotes: 3