Reputation: 6241
I have a List of type Integer, e.g.:
[1, 1, 2, 3, 3, 3]
I would like a method to return all the duplicates, e.g.:
[1, 3]
What is the best way to do this?
Upvotes: 153
Views: 325401
Reputation: 339
And a version which uses the commons-collections CollectionUtils.getCardinalityMap method:
final List<Integer> values = Arrays.asList(1, 1, 2, 3, 3, 3);
final Map<Integer, Integer> cardinalityMap = CollectionUtils.getCardinalityMap(values);
System.out.println(cardinalityMap
        .entrySet()
        .stream()
        .filter(e -> e.getValue() > 1)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList()));
Upvotes: 3
Reputation: 11
So, this is how I solved it. It carries a bit more overhead, but it returns exactly what the OP wanted:
public static List<Something> findDuplicatesInList(List<Something> somethingList) {
    // keep every element that occurs more than once (O(n^2) because of the nested stream)
    List<Something> temp = somethingList
            .stream()
            .filter(alpha -> somethingList
                    .stream()
                    .filter(beta -> beta.equals(alpha))
                    .count() > 1)
            .collect(Collectors.toList());
    // report each duplicated value only once
    List<Something> duplicateSomethings = new ArrayList<>();
    for (Something something : temp) {
        if (!duplicateSomethings.contains(something)) {
            duplicateSomethings.add(something);
        }
    }
    return duplicateSomethings;
}
Upvotes: 0
Reputation: 8988
The add method of Set returns a boolean indicating whether the value could be added (true if the set did not already contain it, false if it did; see the Set documentation). So just iterate through all the values:
public Set<Integer> findDuplicates(List<Integer> listContainingDuplicates) {
    final Set<Integer> setToReturn = new HashSet<>();
    final Set<Integer> set1 = new HashSet<>();
    for (Integer yourInt : listContainingDuplicates) {
        if (!set1.add(yourInt)) {
            setToReturn.add(yourInt);
        }
    }
    return setToReturn;
}
Upvotes: 218
Reputation: 99
List.of(1, 1, 3, 4, 5, 5, 6).stream()
    .collect(Collectors.collectingAndThen(
            Collectors.groupingBy(Function.identity()),
            map -> map.entrySet()
                    .stream()
                    .filter(e -> e.getValue().size() > 1)
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList())));
Upvotes: 1
Reputation: 1910
Similar to some answers here, but if you want to find duplicates based on some property:
// requires: import static java.util.stream.Collectors.toSet;
public static <T, R> Set<R> findDuplicates(Collection<? extends T> collection, Function<? super T, ? extends R> mapper) {
    Set<R> uniques = new HashSet<>();
    return collection.stream()
            .map(mapper)
            .filter(e -> !uniques.add(e))
            .collect(toSet());
}
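For example (a usage sketch of my own, not from the original answer), finding names that collide case-insensitively:
List<String> names = Arrays.asList("Alice", "bob", "ALICE", "carol");
Set<String> collisions = findDuplicates(names, String::toLowerCase);
// collisions is [alice], because "Alice" and "ALICE" map to the same key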
Upvotes: 4
Reputation: 31
A more generic method, as a variant of https://stackoverflow.com/a/52296246:
/**
 * Returns the duplicated values found in the given collection, based on a field classifier.
 *
 * @param collection      given collection of elements
 * @param fieldClassifier field classifier which specifies the element to check for duplicates (useful for complex objects).
 * @param <T>             type of element in the collection
 * @param <K>             type of the key extracted by fieldClassifier and returned from the method
 * @return a list of values that are duplicated
 */
public static <T, K> List<K> lookForDuplicates(List<T> collection, Function<? super T, ? extends K> fieldClassifier) {
    return collection.stream().collect(Collectors.groupingBy(fieldClassifier))
            .entrySet()
            .stream()
            .filter(e -> e.getValue().size() > 1)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
}
Upvotes: 0
Reputation: 1986
A thread-safe alternative is this:
/**
 * Returns all duplicates that are in the list, as a new {@link Set}, in a thread-safe way.
 * <p>
 * Usually the Set will contain each duplicated value only once; however, which elements
 * count as equal depends on the implementation of the {@link List}. An exotic
 * implementation of {@link List} might decide two distinct elements are "equal",
 * in which case multiple duplicates might be returned.
 *
 * @param <X> The type of element to compare.
 * @param list The list that contains the elements, never <code>null</code>.
 * @return A set of the duplicated values in the list (each value at most once).
 */
public <X> Set<X> findDuplicates(List<X> list) {
    Set<X> dups = new LinkedHashSet<>(list.size());
    synchronized (list) {
        for (X x : list) {
            if (list.indexOf(x) != list.lastIndexOf(x)) {
                dups.add(x);
            }
        }
    }
    return dups;
}
Upvotes: 1
Reputation: 837
I took Sebastian's answer and added a keyExtractor to it -
private <U, T> Set<T> findDuplicates(Collection<T> collection, Function<? super T, ? extends U> keyExtractor) {
    Map<U, T> uniques = new HashMap<>(); // maps unique keys to corresponding values
    return collection.stream()
            .filter(e -> uniques.put(keyExtractor.apply(e), e) != null)
            .collect(Collectors.toSet());
}
Upvotes: 1
Reputation: 566
Here is a solution using Streams with Java 8
// let's assume the original list is filled with {1, 1, 2, 3, 6, 3, 8, 7}
List<Integer> original = Arrays.asList(1, 1, 2, 3, 6, 3, 8, 7);
List<Integer> result = new ArrayList<>();
You just check whether the frequency of each element in the list is greater than one, then call .distinct() so that each duplicate appears only once in the result:
result = original.stream()
        .filter(e -> Collections.frequency(original, e) > 1)
        .distinct()
        .collect(Collectors.toList());
// returns {1, 3}
// returns only numbers which occur more than once

result = original.stream()
        .filter(e -> Collections.frequency(original, e) == 1)
        .collect(Collectors.toList());
// returns {2, 6, 8, 7}
// returns numbers which occur only once

result = original.stream()
        .distinct()
        .collect(Collectors.toList());
// returns {1, 2, 3, 6, 8, 7}
// returns the list without duplicates
Upvotes: 40
Reputation: 6686
If you use Eclipse Collections, this will work:
MutableList<Integer> list = Lists.mutable.with(1, 1, 2, 3, 3, 3);
Set<Integer> dupes = list.toBag().selectByOccurrences(i -> i > 1).toSet();
Assert.assertEquals(Sets.mutable.with(1, 3), dupes);
Update: As of Eclipse Collections 9.2 you can now use selectDuplicates
MutableList<Integer> list = Lists.mutable.with(1, 1, 2, 3, 3, 3);
Set<Integer> dupes = list.toBag().selectDuplicates().toSet();
Assert.assertEquals(Sets.mutable.with(1, 3), dupes);
You can also use primitive collections to accomplish this:
IntList list = IntLists.mutable.with(1, 1, 2, 3, 3, 3);
IntSet dupes = list.toBag().selectDuplicates().toSet();
Assert.assertEquals(IntSets.mutable.with(1, 3), dupes);
Note: I am a committer for Eclipse Collections.
Upvotes: 3
Reputation: 25421
I needed a solution to this as well. I used leifg's solution and made it generic.
private <T> Set<T> findDuplicates(Collection<T> collection) {
    Set<T> duplicates = new LinkedHashSet<>();
    Set<T> uniques = new HashSet<>();
    for (T t : collection) {
        if (!uniques.add(t)) {
            duplicates.add(t);
        }
    }
    return duplicates;
}
Upvotes: 63
Reputation:
Just in case you also want both the duplicates and the non-duplicates: this is basically the same as the accepted answer, but instead of returning what is collected in the if branch, you return what is collected in the else branch. Use this code (and change the type to whatever you need):
public Set<String> findDup(List<String> duplicates) {
    Set<String> returning = new HashSet<>();
    Set<String> nonreturning = new HashSet<>();
    Set<String> setup = new HashSet<>();
    for (String i : duplicates) {
        if (!setup.add(i)) {
            returning.add(i);    // seen before: a duplicate
        } else {
            nonreturning.add(i); // first occurrence: not (yet) a duplicate
        }
    }
    // Android Toast, used here only as debug output
    Toast.makeText(context, "hello set" + returning + nonreturning + " size" + nonreturning.size(), Toast.LENGTH_SHORT).show();
    return nonreturning;
}
Upvotes: 0
Reputation: 144
How about this code -
public static void main(String[] args) {
    // Let's say we have elements in an array
    int[] a = {13, 65, 13, 67, 88, 65, 88, 23, 65, 88, 92};
    List<Integer> ls1 = new ArrayList<>();
    List<Integer> ls2 = new ArrayList<>();
    Set<Integer> ls3 = new TreeSet<>();

    // Add each element of the array to the list
    for (int i = 0; i < a.length; i++) {
        ls1.add(a[i]);
    }

    // Iterate over each element in the list
    for (Integer eachInt : ls1) {
        // If ls2 already contains the element, add it to the set (it is a duplicate)
        if (ls2.contains(eachInt)) {
            ls3.add(eachInt);
        } else {
            ls2.add(eachInt);
        }
    }
    System.out.println("Elements in array or ls1: " + ls1);
    System.out.println("Duplicate elements in Set ls3: " + ls3);
}
Upvotes: 0
Reputation: 471
Java 8 based solution:
List<Integer> duplicates =
        list.stream().collect(Collectors.groupingBy(Function.identity()))
                .entrySet()
                .stream()
                .filter(e -> e.getValue().size() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
Upvotes: 46
Reputation: 109
This would be a good method to find duplicate values without using a Set.
public static <T> List<T> findDuplicates(List<T> list) {
    List<T> nonDistinctElements = new ArrayList<>();
    for (T s : list)
        if (list.indexOf(s) != list.lastIndexOf(s))
            if (!nonDistinctElements.contains(s))
                nonDistinctElements.add(s);
    return nonDistinctElements;
}
And suppose you want a method that leaves only distinct elements in the list, i.e. if you pass a list where some elements occur more than once, the extra occurrences are removed in place:
public static <T> void distinctList(List<T> list) {
    List<T> nonDistinctElements = new ArrayList<>();
    for (T s : list)
        if (list.indexOf(s) != list.lastIndexOf(s))
            nonDistinctElements.add(s);

    for (T nonDistinctElement : nonDistinctElements)
        if (list.indexOf(nonDistinctElement) != list.lastIndexOf(nonDistinctElement))
            list.remove(nonDistinctElement);
}
Upvotes: 0
Reputation: 31
public class practicese {
    public static void main(String[] args) {
        List<Integer> listOf = new ArrayList<Integer>();
        listOf.add(3);
        listOf.add(1);
        listOf.add(2);
        listOf.add(3);
        listOf.add(3);
        listOf.add(2);
        listOf.add(1);

        List<Integer> tempList = new ArrayList<Integer>();
        for (Integer obj : listOf) {
            if (!tempList.contains(obj)) {
                tempList.add(obj);
            }
        }
        System.out.println(tempList);
    }
}
Upvotes: 2
Reputation: 1932
I took John Strickler's solution and remade it to use the streams API introduced in JDK8:
private <T> Set<T> findDuplicates(Collection<T> collection) {
Set<T> uniques = new HashSet<>();
return collection.stream()
.filter(e -> !uniques.add(e))
.collect(Collectors.toSet());
}
Upvotes: 62
Reputation: 1
import java.util.Scanner;
public class OnlyDuplicates {
public static void main(String[] args) {
System.out.print(" Enter a set of 10 numbers: ");
int[] numbers = new int[10];
Scanner input = new Scanner(System.in);
for (int i = 0; i < numbers.length; i++) {
numbers[i] = input.nextInt();
}
numbers = onlyDuplicates(numbers);
System.out.print(" The numbers are: ");
for (int i = 0; i < numbers.length; i++) {
System.out.print(numbers[i] + " ");
}
}
public static int[] onlyDuplicates(int[] list) {
boolean flag = true;
int[] array = new int[0];
array = add2Array(array, list[0]);
for (int i = 0; i < list.length; i++) {
for (int j = 0; j < array.length; j++) {
if (list[i] == array[j]) {
flag = false;
break;
}
}
if (flag) {
array = add2Array(array, list[i]);
}
flag = true;
}
return array;
}
// Copy source into dest
// If dest is shorter than source, return false
public static boolean copyArray(int[] source, int[] dest) {
if (source.length > dest.length) {
return false;
}
for (int i = 0; i < source.length; i++) {
dest[i] = source[i];
}
return true;
}
// Increase array size by one and add integer to the end of the array
public static int[] add2Array(int[] source, int data) {
int[] dest = new int[source.length + 1];
copyArray(source, dest);
dest[source.length] = data;
return dest;
}
}
Upvotes: 0
Reputation: 172
Lambdas might be a solution:
Integer[] nums = new Integer[] {1, 1, 2, 3, 3, 3};
List<Integer> list = Arrays.asList(nums);
List<Integer> dps = list.stream().distinct().filter(entry -> Collections.frequency(list, entry) > 1).collect(Collectors.toList());
Upvotes: 5
Reputation: 3473
Just try this. For example, if the list values are [1, 2, 3, 4, 5, 6, 4, 3, 7, 8], the duplicate items are [3, 4].
Collections.sort(list);
List<Integer> dup = new ArrayList<>();
for (int i = 0; i < list.size() - 1; i++) {
    // use equals(), not ==, to compare Integer objects by value
    if (list.get(i).equals(list.get(i + 1))) {
        if (!dup.contains(list.get(i + 1))) {
            dup.add(list.get(i + 1));
        }
    }
}
System.out.println("duplicate item " + dup);
Upvotes: -1
Reputation: 37
public class DuplicatesWithOutCollection {
public static void main(String[] args) {
int[] arr = new int[] { 2, 3, 4, 6, 6, 8, 10, 10, 10, 11, 12, 12 };
boolean flag = false;
int k = 1;
while (k == 1) {
arr = removeDuplicate(arr);
flag = checkDuplicate(arr, flag);
if (flag) {
k = 1;
} else {
k = 0;
}
}
}
private static boolean checkDuplicate(int[] arr, boolean flag) {
int i = 0;
while (i < arr.length - 1) {
if (arr[i] == arr[i + 1]) {
flag = true;
} else {
flag = false;
}
i++;
}
return flag;
}
private static int[] removeDuplicate(int[] arr) {
int i = 0, j = 0;
int[] temp = new int[arr.length];
while (i < arr.length - 1) {
if (arr[i] == arr[i + 1]) {
temp[j] = arr[i + 1];
i = i + 2;
} else {
temp[j] = arr[i];
i = i + 1;
if (i == arr.length - 1) {
temp[j + 1] = arr[i]; // copy the final element (arr[i + 1] would be out of bounds here)
break;
}
}
j++;
}
System.out.println();
return temp;
}
}
Upvotes: 0
Reputation: 16209
This also works:
public static Set<Integer> findDuplicates(List<Integer> input) {
List<Integer> copy = new ArrayList<Integer>(input);
for (Integer value : new HashSet<Integer>(input)) {
copy.remove(value);
}
return new HashSet<Integer>(copy);
}
Upvotes: 7
Reputation: 16999
Compact generified version of the top answer, also added empty check and preallocated Set size:
public static final <T> Set<T> findDuplicates(final List<T> listWhichMayHaveDuplicates) {
    final Set<T> duplicates = new HashSet<>();
    final int listSize = listWhichMayHaveDuplicates.size();
    if (listSize > 0) {
        final Set<T> tempSet = new HashSet<>(listSize);
        for (final T element : listWhichMayHaveDuplicates) {
            if (!tempSet.add(element)) {
                duplicates.add(element);
            }
        }
    }
    return duplicates;
}
Upvotes: 1
Reputation: 341
Using Guava on Java 8
private Set<Integer> findDuplicates(List<Integer> input) {
    // Linked* preserves insertion order, so the returned Set's iteration order roughly follows the original list
    LinkedHashMultiset<Integer> duplicates = LinkedHashMultiset.create(input);
    // Remove all entries with a count of 1
    duplicates.entrySet().removeIf(entry -> entry.getCount() == 1);
    return duplicates.elementSet();
}
Upvotes: 8
Reputation: 3534
int[] nums = new int[] {1, 1, 2, 3, 3, 3};
Arrays.sort(nums);
for (int i = 0; i < nums.length - 1; i++) {
    if (nums[i] == nums[i + 1]) {
        System.out.println("duplicate item " + nums[i + 1] + " at location " + (i + 1));
    }
}
Obviously you can do whatever you want with them instead of printing (e.g. put them in a Set to get a unique list of duplicate values). This approach has the added benefit of recording the locations of the duplicate items.
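For instance, a minimal sketch of that Set variant (my own wording of the same idea as above):
int[] nums = new int[] {1, 1, 2, 3, 3, 3};
Arrays.sort(nums);
Set<Integer> dups = new LinkedHashSet<>();
for (int i = 0; i < nums.length - 1; i++) {
    if (nums[i] == nums[i + 1]) {
        dups.add(nums[i]); // the Set keeps each duplicate value only once
    }
}
// dups is [1, 3]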
Upvotes: 15
Reputation: 758
If you know the maximum value (for example, < 10000) you can sacrifice space for speed; I can't remember the exact name of this technique, but the idea is to use each value as an index into a bit array.
pseudo code:
// does not handle the case when memory allocation fails
// could probably be extended to unknown / larger values, maybe by sorting first
public List<int> GetDuplicates(int max)
{
    // allocate and clear memory to 0/false
    bit[] buckets = new bit[max]
    memcpy(buckets, 0, max);

    // find duplicates
    List<int> result = new List<int>();
    foreach (int val in List)
    {
        if (buckets[val])
        {
            result.add(val);
        }
        else
        {
            buckets[val] = 1;
        }
    }
    return result
}
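For reference, here is one way that pseudocode could be translated into runnable Java using java.util.BitSet. This is my sketch rather than the original author's code, and it assumes non-negative values below a known maximum:
// requires: java.util.ArrayList, java.util.BitSet, java.util.List
public static List<Integer> getDuplicates(List<Integer> values, int max) {
    BitSet seen = new BitSet(max);     // one bit per possible value
    BitSet reported = new BitSet(max); // so each duplicate is added to the result only once
    List<Integer> result = new ArrayList<>();
    for (int val : values) {
        if (seen.get(val)) {
            if (!reported.get(val)) {
                result.add(val);
                reported.set(val);
            }
        } else {
            seen.set(val);
        }
    }
    return result;
}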
Upvotes: -1
Reputation: 22297
This is a problem where functional techniques shine. For example, the following F# solution is both clearer and less bug prone than the best imperative Java solution (and I work daily with both Java and F#).
[1;1;2;3;3;3]
|> Seq.countBy id
|> Seq.choose (fun (key,count) -> if count > 1 then Some(key) else None)
Of course, this question is about Java. So my suggestion is to adopt a library which brings functional features to Java. For example, it could be solved using my own library as follows (and there are several others out there worth looking at too):
Seq.of(1, 1, 2, 3, 3, 3)
   .groupBy(new Func1<Integer, Integer>() {
       public Integer call(Integer key) {
           return key;
       }
   })
   .filter(new Predicate<Grouping<Integer, Integer>>() {
       public Boolean call(Grouping<Integer, Integer> grouping) {
           return grouping.getGrouping().count() > 1;
       }
   })
   .map(new Func1<Grouping<Integer, Integer>, Integer>() {
       public Integer call(Grouping<Integer, Integer> grouping) {
           return grouping.getKey();
       }
   });
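For comparison, the same countBy / choose pipeline can be written with the standard java.util.stream API that ships with Java 8 and later (a sketch of my own, not part of the original answer):
List<Integer> duplicates = Stream.of(1, 1, 2, 3, 3, 3)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting())) // countBy id
        .entrySet().stream()
        .filter(e -> e.getValue() > 1)                                              // choose count > 1
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
// duplicates is [1, 3]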
Upvotes: 0
Reputation: 1815
This should work for both sorted and unsorted lists.
public void testFindDuplicates() {
    List<Integer> list = new ArrayList<Integer>();
    list.add(1);
    list.add(1);
    list.add(2);
    list.add(3);
    list.add(3);
    list.add(3);

    Set<Integer> result = new HashSet<Integer>();
    int currentIndex = 0;
    for (Integer i : list) {
        if (!result.contains(i) && list.subList(currentIndex + 1, list.size()).contains(i)) {
            result.add(i);
        }
        currentIndex++;
    }

    assertEquals(2, result.size());
    assertTrue(result.contains(1));
    assertTrue(result.contains(3));
}
Upvotes: 1
Reputation: 178451
Create a Map<Integer, Integer>, iterate the list, and for each element: if it is already in the map, increase its count, otherwise put it in the map with a count of 1. Then iterate the map and collect all elements whose count is >= 2:
public static void main(String[] args) {
    List<Integer> list = new LinkedList<Integer>();
    list.add(1);
    list.add(1);
    list.add(1);
    list.add(2);
    list.add(3);
    list.add(3);

    Map<Integer, Integer> map = new HashMap<Integer, Integer>();
    for (Integer x : list) {
        Integer val = map.get(x);
        if (val == null) {
            map.put(x, 1);
        } else {
            map.remove(x);
            map.put(x, val + 1);
        }
    }

    List<Integer> result = new LinkedList<Integer>();
    for (Entry<Integer, Integer> entry : map.entrySet()) {
        if (entry.getValue() > 1) {
            result.add(entry.getKey());
        }
    }

    for (Integer x : result) {
        System.out.println(x);
    }
}
Upvotes: 1
Reputation: 117597
You can use something like this:
List<Integer> newList = new ArrayList<Integer>();
List<Integer> copy = new ArrayList<Integer>(yourOldList); // work on a copy to avoid a ConcurrentModificationException
for (Integer i : yourOldList)
{
    copy.remove(i); // Integer argument, so this removes the first matching value rather than an index
    if (copy.contains(i) && !newList.contains(i)) newList.add(i);
}
Upvotes: 6