PacificNW_Lover

Reputation: 5374

Count Duplicates in Java 8 using Streams Based on Field

I am trying to count how many ids are duplicated in a list of Item objects. Two items are duplicates if they have the same id.

e.g.

[5, 5, 2, 4, 2]

The ids 5 and 2 both occur more than once, so the answer is 2.


public class Item {

    int id;

    public Item(int id) {
        this.id = id;
    }

    public int getId() {
        return id;
    }
}

public class DuplicateItems {

    public static int count(List<Item> items) {
        int count = 0;
        if (items.size() == 0) {
            return 0;
        }

        items.sort(Comparator.comparingInt(Item::getId));
        Map<Object, Long> resultMap = new HashMap<>();
        items.forEach(e -> resultMap.put(e, resultMap.getOrDefault(e, 0L) + 1L));
        System.out.println(resultMap.size());
        return count;
    }

    private static List<Item> convertToList(int[] values) {
        List<Item> items = new ArrayList<>();
        for (int num : values) {
            items.add(new Item(num));
        }
        return items;
    }

    public static void main(String[] args) {
        int[] itemsArray = {5, 5, 2, 4, 2};
        List<Item> items = convertToList(itemsArray);
        int duplicateCount = count(items);
        System.out.println("Duplicate Count: " + duplicateCount);
    }
}

When I run the program, it says this:

Duplicate Count: 5

Why is the value not 2?

Upvotes: 1

Views: 6829

Answers (3)

Singh Ravish K

Reputation: 1

Just putting in code what @Jacob G correctly said.

public class CountDuplicates {

    public static int count(List<Item> items) {
        int count = 0;
        if (items.size() == 0) {
            return 0;
        }

        items.sort(Comparator.comparingInt(Item::getId));
        System.out.println("items: " + items);
        Map<Object, Long> resultMap = new HashMap<>();
        items.forEach(e -> {
            System.out.println("e : " + e);
            resultMap.put(e, resultMap.getOrDefault(e, 0L) + 1L);   
        });
        System.out.println("Map size: " + resultMap.size());
        return count;
    }

    private static List<Item> convertToList(int[] values) {
        List<Item> items = new ArrayList<>();
        for (int num : values) {
            items.add(new Item(num));
        }
        return items;
    }

    public static void main(String[] args) {
        int[] itemsArray = {5, 5, 2, 4, 2};
        List<Item> items = convertToList(itemsArray);
        int duplicateCount = count(items);
        System.out.println("Duplicate Count: " + duplicateCount);
    }

}


class Item {

    int id;

    public Item(int id) {
        this.id = id;
    }

    public int getId() {
        return id;
    }
    
    @Override
    public String toString() {
        return id+"";
    }
    
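    @Override
    public boolean equals(Object o) {
        // Items are equal when their ids match; anything that is not an Item is never equal.
        return o instanceof Item && ((Item) o).id == id;
    }
    
    @Override
    public int hashCode() {
        // Must agree with equals so that equal items land in the same HashMap bucket.
        return Integer.hashCode(id);
    }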

}

Output is:

items: [2, 2, 4, 5, 5]
e : 2
e : 2
e : 4
e : 5
e : 5
Map size: 3
Duplicate Count: 0
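
The map now has the right size (3 distinct ids), but the duplicate count is still 0 because count is never updated after the map is filled. A minimal sketch of the missing last step, assuming the same equals/hashCode contract on Item as above. The class name MapDuplicateCounter and the compact nested Item are made up for this example and are not part of the original code:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapDuplicateCounter {

    // Compact stand-in for the Item class above (assumption: equality is by id only).
    static class Item {
        final int id;

        Item(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Item && ((Item) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    static int count(List<Item> items) {
        // Build the frequency map; equal items share one key thanks to equals/hashCode.
        Map<Item, Long> frequency = new HashMap<>();
        for (Item item : items) {
            frequency.merge(item, 1L, Long::sum);
        }

        // The step the code above leaves out: count the keys seen more than once.
        int duplicates = 0;
        for (long occurrences : frequency.values()) {
            if (occurrences > 1) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(5), new Item(5), new Item(2), new Item(4), new Item(2));
        System.out.println("Duplicate Count: " + count(items));
    }
}
```

Sorting the list first is not needed for this approach, since a HashMap does not depend on insertion order.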

Upvotes: 0

Eugene

Reputation: 120848

You are doing so many steps that are misleading or wrong. Why not simply:

items.stream()
     .map(Item::getId)
     .collect(Collectors.groupingBy(
         Function.identity(),
         Collectors.counting()
     ))
     .values()
     .stream()
     .filter(x -> x > 1)
     .count();

That is: first collect the ids into a Map of id to occurrence count, then count only those values that are greater than 1.
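
For anyone who wants to run this end to end, here is a self-contained sketch of the same pipeline. The wrapper class StreamDuplicateCounter and the method name countDuplicateIds are made up for the example; the nested Item is a trimmed copy of the one in the question. Note that it returns long, because that is what Stream.count() produces:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamDuplicateCounter {

    // Trimmed copy of the question's Item; no equals/hashCode needed here,
    // because grouping is done on the boxed Integer id, not on the Item itself.
    static class Item {
        private final int id;

        Item(int id) {
            this.id = id;
        }

        int getId() {
            return id;
        }
    }

    static long countDuplicateIds(List<Item> items) {
        return items.stream()
                .map(Item::getId)
                // Map<Integer, Long>: id -> number of occurrences
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .values()
                .stream()
                // keep only ids that occur more than once
                .filter(occurrences -> occurrences > 1)
                .count();
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(5), new Item(5), new Item(2), new Item(4), new Item(2));
        System.out.println("Duplicate Count: " + countDuplicateIds(items));
    }
}
```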

Upvotes: 3

WJS

Reputation: 40034

This puts them in a map based on frequency and then counts the number of values greater than 1.

    // list2 is assumed to be the List<Item> from the question
    long dups = list2.stream()
            .collect(Collectors.groupingBy(Item::getId, Collectors.counting()))
            .values().stream()
            .filter(i -> i > 1)
            .count();

    System.out.println(dups);
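
To see what the intermediate frequency map looks like before the filter, it can be collected on its own. This is only an illustrative sketch: the class FrequencyMapDemo and the method frequencies are hypothetical names, and a TreeMap is passed as the map factory purely so the printed order is predictable (the two-argument groupingBy used above makes no ordering promise):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class FrequencyMapDemo {

    // Trimmed copy of the question's Item.
    static class Item {
        private final int id;

        Item(int id) {
            this.id = id;
        }

        int getId() {
            return id;
        }
    }

    // id -> number of occurrences, sorted by id because a TreeMap is requested.
    static Map<Integer, Long> frequencies(List<Item> items) {
        return items.stream()
                .collect(Collectors.groupingBy(Item::getId, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(5), new Item(5), new Item(2), new Item(4), new Item(2));

        Map<Integer, Long> frequency = frequencies(items);
        System.out.println(frequency); // {2=2, 4=1, 5=2}

        // Only the entries for ids 2 and 5 have a value greater than 1.
        long dups = frequency.values().stream().filter(i -> i > 1).count();
        System.out.println(dups); // 2
    }
}
```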

Upvotes: 10
