Reputation: 62519
I have a method i wrote to find the duplicates in a List. It works fine but im concerned about the complexity of using containsKey. When we use containsKey we have to compute a hash function for every key and then compare against each with our search item, right ? So wouldn't the complexity be O(n) ?
Here is the function:
public void findDup(List<String> list){
HashMap<String,Integer> map = new HashMap<>();
int pos=0;
for(String s: list){
if(map.containsKey(s)){
Log.v("myapp","duplicate found:"+s);
}
else
map.put(s,pos);
pos++;
}
}
and to call it i do this:
List<String>list=new ArrayList<>();
for(int i=0;i<12;i++)
list.add(i+"");
//these numbers should surely be duplicates
list.add("3");list.add("6");
findDup(list);
//output will be 3 and 6 clearly.
update: i re-wrote the function to just use a set which makes more sense:
public void findDup(List<Integer> list){
HashSet<Integer> set = new HashSet<>();
for(Integer num: list){
if(!set.add(num)){
Log.v("myapp","duplicate found:"+num);
}
}
}
Upvotes: 3
Views: 10677
Reputation: 43088
When we use containsKey we have to compute a hash function for every key and then compare against each with our search item, right ?
Wrong.
When the hash function gives a value, this value is used to index into the table directly. There is no further comparisons, unless collisions occur in which case it may or may not even compare depending on the collision resolution method that is used.
Since you are simply checking for duplicates, a hashset is a better data structure to use
Upvotes: 1
Reputation: 310893
It is specified in the Javadoc as O(1).
The complexity of your algorithm is therefore O(N).
But it would be that even without the containsKey()
call, which is actually unnecessary. All you have to do is test whether put()
returns a non-null value, which indicates a duplicate.
When we use containsKey we have to compute a hash function for every key and then compare against each with our search item, right?
Wrong. We compute the hash value of the search key and check if that bucket is occupied by an equal key.
So wouldn't the complexity be O(n) ?
No.
Upvotes: 7
Reputation: 3631
The answer is specified in the documentation.
This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among >the buckets.
Since containsKey()
is just a get() that throws away the retrieved value, it's O(1) (assuming the hash function works properly).
FYI containsValue
is O(n), because without the key it doesn't know where it is and the algorithm has to go over all the values stored in the map.
Upvotes: 3
Reputation: 726569
So wouldn't the complexity be O(n)?
Yes, the complexity for the entire list is going to be O(n). However, you do not need to use HashMap<K,V>
because HashSet<Key>
would be sufficient for finding duplicates:
Set<String> seen = new HashSet<>();
for(String s: list){
if(!seen.add(s)){
Log.v("myapp","duplicate found: "+s);
}
}
Upvotes: 3