Reputation: 5166
I have a doubt regarding data structures in Java. While solving a typical hashing problem in Java, I was using the HashSet
data structure, which worked fine until there were duplicate objects (object contents). Since HashSet does not support insert of duplicates, my logic was failing.
I replaced the hashset with the typical Arraylist
since the methods of hashset such as .add(), .contains(), .remove() are supported in both, and my logic worked perfectly then.
But does this necessarily mean ArrayList is the logical choice over Hashset when duplicates are involved? There should be some time complexity advantages of Hashset over ArrayList right? Can someone please provide me some insight regarding this?
EDIT: What would be the ideal data structure when you want to do hashing when duplicates are involved. I mean when the duplicates should not be ignored and should be inserted.
Upvotes: 1
Views: 4651
Reputation: 7032
If you specifically need a HashSet
that handles duplicates, a HashMap
will be able to do the job. If you just need a count of the number of objects added (with quick lookup/etc), a HashMap<T,Integer>
will be ideal, where T
is the type of your object. If you actually need to keep references to the duplicate objects you've added, go with HashMap<T, List<T>>
. That way you can look up by using HashMap's .containsKey(T t)
, and iterate through all of the similarly hashing objects in the resulting list. So for example, you could create this class:
public class HashSetWithDuplicates<T> {
private HashMap<T, List<T>> entries;
private int size;
public HashSetWithDuplicates(){
entries = new HashMap<>();
size = 0;
}
public HashSetWithDuplicates(Collection<? extends T> col){
this();
for(T t : col){
add(t);
}
}
public boolean contains(T t){
return entries.containsKey(t);
}
public List<T> get(T t){
return entries.get(t);
}
public void add(T t){
if (!contains(t)) entries.put(t, new ArrayList<>());
entries.get(t).add(t);
size++;
}
public void remove(T t){
if (!contains(t)) return;
entries.get(t).remove(t);
if(entries.get(t).isEmpty()) entries.remove(t);
size--;
}
public int size(){
return size;
}
public boolean isEmpty(){
return size() == 0;
}
}
Add functionality to your needs.
Upvotes: 0
Reputation: 7579
It's not clear what you mean by a "hashing problem," but maybe you're looking for a multiset. From the Guava docs:
A collection that supports order-independent equality, like Set, but may have duplicate elements. A multiset is also sometimes called a bag.
Elements of a multiset that are equal to one another are referred to as occurrences of the same single element. The total number of occurrences of an element in a multiset is called the count of that element (the terms "frequency" and "multiplicity" are equivalent, but not used in this API).
No such thing exists in the JDK.
Upvotes: 4
Reputation: 15146
ArrayList
is not the logical choice if you don't want duplicates. Different tools for different use cases.
You would use a Set
in areas where duplicates wouldn't make sense, for example, a set of students. A List
allows duplicates.
Upvotes: 2
Reputation: 1506
It all depended on what you need given your requirements.
Upvotes: 2