leigero

Reputation: 3283

Java: extract duplicates from a List and store them in another list?

I have a list of words which contains multiple duplicate words. I want to extract the words that are duplicated and store them in another list (maintaining the integrity of the original list).

I tried iterating through the list as shown below, but this fails logically because the inner loop also visits the element itself, so every word is at some point equal to primary. What I really want is to iterate through the list and, for every String in it, check all the OTHER strings in the list for duplicates.

Is there a method in the List interface that allows this type of comparison?

For reference, list1 is a List of Strings.

for(String primary: list1){
    for(String dupe: list1){
        if(primary.equals(dupe)){
            System.out.print(primary + " " + dupe);
            ds3.add(primary);
        }
    }
}

EDIT:

I should note that I'm aware that a Set doesn't allow duplicates, but what I'm trying to do is OBTAIN the duplicates. I want to find them, take them out, and use them later. I'm not trying to eradicate them.

Upvotes: 1

Views: 1298

Answers (4)

Óscar López

Reputation: 236104

The easiest way to remove the duplicates is to add all elements into a Set:

Set<String> nodups = new LinkedHashSet<String>(list1);
List<String> ds3 = new ArrayList<String>(nodups);

In the above code, ds3 will be duplicate-free. Now, if you're interested in finding which elements are duplicated in O(n), count the occurrences of each element in a Map:

Map<String, Integer> counter = new LinkedHashMap<String, Integer>();
for (String s : list1) {
    if (counter.containsKey(s))
        counter.put(s, counter.get(s) + 1);
    else
        counter.put(s, 1);
}

With the above, it's easy to find the duplicated elements:

List<String> ds3 = new ArrayList<String>();
for (Map.Entry<String, Integer> entry : counter.entrySet())
    if (entry.getValue() > 1)
        ds3.add(entry.getKey());
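
For reference, the two counting snippets above can be combined into one self-contained sketch. The class name DuplicateCounter, the method name findDuplicates and the sample words are made up for illustration, and the sketch assumes Java 8 or later because it uses Map.merge in place of the containsKey/put pair:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateCounter {
    // Hypothetical helper: returns each word that occurs more than once,
    // in order of first appearance. The input list is never modified.
    static List<String> findDuplicates(List<String> words) {
        Map<String, Integer> counter = new LinkedHashMap<>();
        for (String s : words) {
            // Java 8+: store 1 for a new key, otherwise add 1 to the existing count.
            counter.merge(s, 1, Integer::sum);
        }
        List<String> duplicates = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counter.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Sample data, invented for this example.
        List<String> list1 = Arrays.asList("apple", "pear", "apple", "fig", "pear", "apple");
        System.out.println(findDuplicates(list1)); // prints [apple, pear]
        System.out.println(list1.size());          // prints 6, the original list is intact
    }
}
```

Because a LinkedHashMap is used, the duplicates come out in the order in which they first appeared in the list, and each duplicated word is reported once regardless of how often it repeats.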

Yet another way, also O(n): use a Set to keep track of the elements already seen. Note that an element occurring k times is added to ds3 k - 1 times:

Set<String> seen = new HashSet<String>();
List<String> ds3 = new ArrayList<String>();
for (String s : list1) {
    if (seen.contains(s))
        ds3.add(s);
    else
        seen.add(s);
}

Upvotes: 4

gerrytan

Reputation: 41133

To obtain only the duplicates (as opposed to eliminating duplicates from the list), you can use a set as a temporary lookup table of the strings that have already been visited:

Set<String> tmp = new HashSet<String>();
for(String primary: list1){
  if(tmp.contains(primary)) {
    // primary is a duplicate
  }
  tmp.add(primary);
}
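
One possible way to fill in the "primary is a duplicate" branch is sketched below. The class name DuplicateFinder, the method name duplicatesOnce and the second set dups are assumptions added for illustration; the second set is there so that a word repeated several times is still collected only once:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DuplicateFinder {
    // Hypothetical helper: collects every duplicated word exactly once,
    // ordered by the position of its first repeat. list1 is only read.
    static List<String> duplicatesOnce(List<String> list1) {
        Set<String> tmp = new HashSet<>();        // strings visited so far
        Set<String> dups = new LinkedHashSet<>(); // duplicates, without repeats
        for (String primary : list1) {
            if (tmp.contains(primary)) {
                dups.add(primary); // primary is a duplicate
            }
            tmp.add(primary);
        }
        return new ArrayList<>(dups);
    }

    public static void main(String[] args) {
        // Sample data, invented for this example.
        List<String> list1 = Arrays.asList("to", "be", "or", "not", "to", "be", "to");
        System.out.println(duplicatesOnce(list1)); // prints [to, be]
    }
}
```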

Upvotes: 0

Peter Lawrey

Reputation: 533660

The intent is to extract the duplicates, not to lose them entirely.

List<String> list =
Set<String> set = new LinkedHashSet<>(); // to keep the order
List<String> dups = new ArrayList<String>(); // may contain the same duplicate more than once
for(String s: list)
    if (!set.add(s)) dups.add(s);
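
The loop relies on Set.add returning false when the element is already present. A small runnable sketch of that behaviour follows; the class name SetAddDemo, the method name repeats and the sample letters are illustrative, not part of the answer:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetAddDemo {
    // Hypothetical wrapper around the loop above: every repeat after the
    // first occurrence is recorded, so the result may itself contain repeats.
    static List<String> repeats(List<String> list) {
        Set<String> set = new LinkedHashSet<>();
        List<String> dups = new ArrayList<>();
        for (String s : list) {
            if (!set.add(s)) { // add returns false if s was already in the set
                dups.add(s);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        // Sample data, invented for this example.
        List<String> list = Arrays.asList("a", "b", "a", "c", "a", "b");
        System.out.println(repeats(list)); // prints [a, a, b]
    }
}
```

If each duplicate should appear only once, dups could be declared as a LinkedHashSet instead of an ArrayList.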

Upvotes: 1

james

Reputation: 26271

Consider using a Set. "A collection that contains no duplicate elements."

Upvotes: 1
