Reputation: 21
From what I understand, the time complexity to iterate through a Hash table with capacity "m" and number of entries "n" is O(n+m). I was wondering, intuitively, why this is the case? For instance, why isn't it n*m?
Thanks in advance!
Upvotes: 1
Views: 5449
Reputation: 168
It is correct that it should be O(m+n). It would be more practical though to also have an estimate of the number of buckets in terms of the number of elements in the HashMap.
Java's implementation of HashMap allows you to construct the HashMap with a given load factor (LF). This makes sure that the number of buckets cannot grow too fast and makes sure that m = O(n). The way to see it would be to stop a specific capacity m_0 and observe that when n reaches n = LF*m_0, m will be doubled. From then on, till the next doubling happens, n > (LF/2)*m.
What this means is that the complexity O(m+n) is also O(n).
Upvotes: 0
Reputation: 1
Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings) n = the number of buckets m = the number of key-value mappings
The complexity of a Hashmap is O(n+m) because the worst-case scenario is one array element contains the whole linked list, which could occur due to flawed implementation of hashcode function being used as a key. Visialise the worst-case scenario To iterate this scenario, first java needs to iterate the complete array O(n) and then iterate the linked list O(m), combining these gives us O(n+m).
Upvotes: 0
Reputation: 34470
You are absolutely correct. Iterating a HashMap
is an O(n + m)
operation, with n
being the number of elements contained in the HashMap
and m
being its capacity. Actually, this is clearly stated in the docs:
Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings).
Intuitively (and conceptually), this is because a HashMap
consists of an array of buckets, with each element of the array pointing to either nothing (i.e. to null
), or to a list of entries.
So if the array of buckets has size m
, and if there are n
entries in the map in total (I mean, n
entries scattered throughout all the lists hanging from some bucket), then, iterating the HashMap
is done by visiting each bucket, and, for buckets that have a list with entries, visiting each entry in the list. As there are m
buckets and n
elements in total, iteration is O(m + n)
.
Note that this is not the case for all hash table implementations. For example, LinkedHashMap
is like a HashMap
, except that it also has all its entries connected in a doubly-linked list fashion (to preserve either insertion or access order). If you are to iterate a LinkedHashMap
, there's no need to visit each bucket. It would be enough to just visit the very first entry and then follow its link to the next entry, and then proceed to the next one, etc, and so on until the last entry. Thus, iterating a LinkedHashMap
is just O(n)
, with n
being the total number of entries.
Upvotes: 3