Reputation: 120
If hashtables/maps with closed hashing are worst-case O(n), are HashSets also going to require O(n) time for lookup, or is it constant time?
Upvotes: 2
Views: 2603
Reputation: 163
I see a lot of people saying the worst case is O(n). This is because the old HashSet implementation used a LinkedList to handle collisions within the same bucket. However, that is not the whole story.
In Java 8, that LinkedList is replaced by a balanced binary tree once the number of collisions in a bucket grows past a threshold. This improves the worst-case performance from O(n) to O(log n) for lookups.
You can check additional details here.
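As a rough sketch of that scenario (the class names here are made up, not from the JDK): a key type whose hashCode always collides, but which implements Comparable, still gets logarithmic lookups on Java 8+, because the overloaded bucket gets treeified. Note that the O(log n) bound relies on the keys being mutually Comparable.

import java.util.HashSet;
import java.util.Set;

// Hypothetical key type: every instance hashes to the same bucket.
// Because it implements Comparable, a Java 8+ HashMap can keep the
// treeified bucket ordered and answer lookups in O(log n).
class CollidingKey implements Comparable<CollidingKey> {
    final int id;
    CollidingKey(int id) { this.id = id; }

    @Override public int hashCode() { return 42; }  // deliberate: all keys collide
    @Override public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).id == id;
    }
    @Override public int compareTo(CollidingKey other) {
        return Integer.compare(id, other.id);
    }
}

public class TreeifiedBucketDemo {
    public static void main(String[] args) {
        Set<CollidingKey> set = new HashSet<>();
        for (int i = 0; i < 100_000; i++) {
            set.add(new CollidingKey(i));
        }
        // All 100,000 entries share one bucket, but the lookup walks a
        // balanced tree rather than scanning a linked list.
        System.out.println(set.contains(new CollidingKey(99_999)));  // true
    }
}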
Upvotes: 0
Reputation: 474
Worst case is O(N) as mentioned; average and amortized run time is constant.
From GeeksforGeeks: The underlying data structure for HashSet is a hashtable. So the amortized (average or usual case) time complexity for the add, remove and look-up (contains method) operations of HashSet is O(1).
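For instance, the three operations the quote mentions:

import java.util.HashSet;
import java.util.Set;

public class AverageCaseDemo {
    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        set.add("apple");                           // amortized O(1)
        System.out.println(set.contains("apple"));  // O(1) on average -> true
        set.remove("apple");                        // amortized O(1)
    }
}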
Upvotes: 0
Reputation: 19623
If you look at the implementation of a HashSet (e.g. from OpenJDK 8: https://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/HashSet.java), you can see that it's actually just built on top of a HashMap. Relevant code snippet here:
public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{
    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    /**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * default initial capacity (16) and load factor (0.75).
     */
    public HashSet() {
        map = new HashMap<>();
    }

    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

    // ... rest of class omitted
The HashSet attempts to slightly optimize memory usage by creating a single static dummy Object named PRESENT and using that as the value part of every key/value entry in the HashMap.
So whatever the performance implications of using a HashMap are, a HashSet will have more or less the same ones, since it's literally using a HashMap under the covers.
To directly answer your question: in the worst case, yes, just as the worst-case complexity of a HashMap is O(n), so too the worst-case complexity of a HashSet is O(n).
It is worth noting that, unless you have a really bad hash function or are using a hashtable of a ridiculously small size, you're very unlikely to see the worst-case performance in practice. You'd have to have every element hash to the exact same bucket, so the performance would essentially degrade to a linked-list traversal (assuming a hashtable using chaining for collision handling, which the Java ones do).
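To see what that degenerate case looks like, here is a sketch (the class names are made up) of a key whose hash function sends every element to the same bucket:

import java.util.HashSet;
import java.util.Set;

// Hypothetical key with a deliberately bad hash function: every
// instance collides into the same bucket, which is exactly the
// degenerate scenario described above.
class BadHashKey {
    final String name;
    BadHashKey(String name) { this.name = name; }

    @Override public int hashCode() { return 0; }  // all keys collide
    @Override public boolean equals(Object o) {
        return o instanceof BadHashKey && ((BadHashKey) o).name.equals(name);
    }
}

public class WorstCaseDemo {
    public static void main(String[] args) {
        Set<BadHashKey> set = new HashSet<>();
        for (int i = 0; i < 10_000; i++) {
            set.add(new BadHashKey("key" + i));
        }
        // Every lookup has to search the single overloaded bucket, so
        // this call is nowhere near O(1); on a pre-Java-8 JDK it scans
        // a linked list of up to 10,000 entries.
        System.out.println(set.contains(new BadHashKey("key9999")));  // true
    }
}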
Upvotes: 1
Reputation: 312219
When looking up an element in a HashMap, it performs an O(1) calculation to find the right bucket, and then iterates over the items there serially until it finds the one that is equal to the requested key, or until all the items have been checked.
In the worst-case scenario, all the items in the map have the same hash code and are therefore stored in the same bucket. In that case, you'll need to iterate over all of them serially, which is an O(n) operation.
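A minimal sketch of that two-phase lookup, assuming separate chaining (the TinyMap and Node names are illustrative, not the real HashMap internals):

// Illustrative only: a stripped-down chained hashtable showing the
// two phases described above.
class TinyMap<K, V> {
    static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    void put(K key, V value) {
        int bucket = (key.hashCode() & 0x7fffffff) % table.length;
        for (Node<K, V> n = table[bucket]; n != null; n = n.next) {
            if (key.equals(n.key)) { n.value = value; return; }  // replace existing
        }
        table[bucket] = new Node<>(key, value, table[bucket]);   // prepend to chain
    }

    V get(Object key) {
        int bucket = (key.hashCode() & 0x7fffffff) % table.length;  // phase 1: O(1) bucket selection
        // Phase 2: serial scan of the chain. If every entry collided
        // into this bucket, the loop visits all n of them - the O(n)
        // worst case.
        for (Node<K, V> n = table[bucket]; n != null; n = n.next) {
            if (key.equals(n.key)) return n.value;
        }
        return null;  // checked every item, no match
    }
}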
A HashSet is just a HashMap where you don't care about the values, only the keys - under the hood, it's a HashMap where all the values are a dummy Object.
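The same idea in miniature - a sketch of a set built on a map, not the actual java.util source:

import java.util.HashMap;
import java.util.Map;

// A set built on a map whose values are all one shared dummy object,
// mirroring the HashSet/HashMap relationship described above.
class MapBackedSet<E> {
    private static final Object DUMMY = new Object();
    private final Map<E, Object> map = new HashMap<>();

    boolean add(E e)      { return map.put(e, DUMMY) == null; }
    boolean contains(E e) { return map.containsKey(e); }
    boolean remove(E e)   { return map.remove(e) == DUMMY; }
}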
Upvotes: 5