Reputation: 31020
Update:
I guess HashSet.add(Object obj) does not call contains. Is there a way to implement what I want (remove duplicate strings, ignoring case, using a Set)?
Original question:
I am trying to remove duplicates from a list of Strings in Java. However, in the following code CaseInsensitiveSet.contains(Object ob) is not getting called. Why?
public static List<String> removeDupList(List<String> list, boolean ignoreCase) {
    Set<String> set = (ignoreCase ? new CaseInsensitiveSet() : new LinkedHashSet<String>());
    set.addAll(list);
    List<String> res = new Vector<String>(set);
    return res;
}

public class CaseInsensitiveSet extends LinkedHashSet<String> {
    @Override
    public boolean contains(Object obj) {
        // this is not getting called
        if (obj instanceof String) {
            return super.contains(((String) obj).toLowerCase());
        }
        return super.contains(obj);
    }
}
Upvotes: 3
Views: 13150
Reputation: 16335
The add() method of LinkedHashSet does not call contains() internally; otherwise your method would have been called as well.
Instead of a LinkedHashSet, why don't you use a SortedSet with a case-insensitive comparator? With the String.CASE_INSENSITIVE_ORDER comparator, your code is reduced to:
public static List<String> removeDupList(List<String> list, boolean ignoreCase) {
    Set<String> set = (ignoreCase ? new TreeSet<String>(String.CASE_INSENSITIVE_ORDER) : new LinkedHashSet<String>());
    set.addAll(list);
    List<String> res = new ArrayList<String>(set);
    return res;
}
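One behaviour of the TreeSet version is worth seeing concretely: the result comes back in case-insensitive sorted order rather than input order, and for each group of strings that are equal ignoring case, the first one inserted is the one kept. A minimal sketch, where the class name TreeSetDedupSketch, the method name dedupSorted and the sample input are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDedupSketch {
    // Deduplicates ignoring case; the result is sorted, not in input order.
    static List<String> dedupSorted(List<String> list) {
        Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        // An element that compares equal to one already in the set is not added.
        set.addAll(list);
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "A", "a", "B");
        System.out.println(dedupSorted(input)); // prints [A, b]
    }
}
```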
If you wish to preserve the order, as @tom anderson pointed out in his comment, you can keep an auxiliary ordered collection (a LinkedHashSet, or simply an ArrayList as in the code here) alongside the TreeSet.
Try adding each element to the TreeSet; if add returns true, also append it to the ordered collection, otherwise skip it.
public static List<String> removeDupList(List<String> list) {
    Set<String> sortedSet = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
    List<String> orderedList = new ArrayList<String>();
    for (String str : list) {
        // add returns true if the element was not already present, false otherwise
        if (sortedSet.add(str)) {
            orderedList.add(str);
        }
    }
    return orderedList;
}
Upvotes: 3
Reputation: 136002
Try
Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
set.addAll(list);
return new ArrayList<String>(set);
UPDATE: as Tom Anderson mentioned, this does not preserve the initial order. If that is really an issue, try
Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
Iterator<String> i = list.iterator();
while (i.hasNext()) {
    String s = i.next();
    if (set.contains(s)) {
        i.remove();
    } else {
        set.add(s);
    }
}
prints
[2, 1]
Upvotes: 8
Reputation: 47183
Here's another approach, using a HashSet of the strings for deduplication, but building the result list directly:
public static List<String> removeDupList(List<String> list, boolean ignoreCase) {
    HashSet<String> seen = new HashSet<String>();
    ArrayList<String> deduplicatedList = new ArrayList<String>();
    for (String string : list) {
        if (seen.add(ignoreCase ? string.toLowerCase() : string)) {
            deduplicatedList.add(string);
        }
    }
    return deduplicatedList;
}
This is fairly simple, makes only one pass over the elements, and does only a lowercase, a hash lookup, and then a list append for each element.
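One caveat, offered as an assumption about what is wanted rather than something stated in the answer: the no-argument toLowerCase() uses the JVM's default locale, so which strings count as duplicates can differ between machines (the Turkish dotless i is the usual example). A hedged variant that pins the locale with Locale.ROOT; the class name SeenSetDedupSketch is hypothetical:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class SeenSetDedupSketch {
    static List<String> removeDupList(List<String> list, boolean ignoreCase) {
        Set<String> seen = new HashSet<String>();
        List<String> result = new ArrayList<String>();
        for (String s : list) {
            // Assumption: locale-independent lowercasing is what is wanted here.
            String key = ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
            if (seen.add(key)) { // true only the first time this key is seen
                result.add(s);   // keep the original spelling and input order
            }
        }
        return result;
    }
}
```

For the input [b, A, a, B, c] with ignoreCase set to true, this keeps b, A and c, in that order.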
Upvotes: 0
Reputation: 888
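Try
@Override
public boolean addAll(Collection<? extends String> c) {
    boolean changed = false;
    for (String s : c) {
        if (!this.contains(s)) {
            // add() returns true if the set changed
            changed |= this.add(s);
        }
    }
    // do not call super.addAll(c) here: it would add every element again without the contains() check
    return changed;
}
@Override
public boolean contains(Object o) {
    // Do your case-insensitive checking here
    return super.contains(o);
}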
This makes sure contains() is called for every element added through addAll(), if you want the code to go through there.
Upvotes: 0
Reputation: 533500
contains() is not called because LinkedHashSet is not implemented that way.
If you want add() to call contains(), you will need to override add() as well.
The reason it is not implemented this way is that calling contains() first would mean performing two lookups instead of one, which would be slower.
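As a rough sketch of what overriding add() as well could look like: the class below keeps a second set of lowercased keys next to the LinkedHashSet, so the original spelling and insertion order are preserved. The class name CaseInsensitiveLinkedSet, the keys field and the use of Locale.ROOT are illustrative assumptions, and remove(), clear() and null elements are deliberately not handled.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

class CaseInsensitiveLinkedSet extends LinkedHashSet<String> {
    // Lowercased form of every element stored so far (hypothetical helper field).
    private final Set<String> keys = new HashSet<String>();

    @Override
    public boolean add(String s) {
        // keys.add returns false when an equal-ignoring-case string was added before.
        if (keys.add(s.toLowerCase(Locale.ROOT))) {
            return super.add(s);
        }
        return false;
    }

    @Override
    public boolean contains(Object o) {
        return o instanceof String
                && keys.contains(((String) o).toLowerCase(Locale.ROOT));
    }
}
```

Because the inherited addAll() calls add() for each element, set.addAll(list) followed by new ArrayList<String>(set) would then deduplicate ignoring case while keeping the first spelling seen.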
Upvotes: 5