utpal sharma

Reputation: 61

Java 8 Streams remove duplicate entries, keeping the element with the min date

I have Java beans like:

class OrderDeatils {
    Long orderId;
    Long userId;
    OrderInfo info;

    // Required getters/setters
}

class OrderInfo {
    OffsetDateTime orderCreatedDate;
    
    // Required getter/setter
}

I have data as List<OrderDeatils> list:

orderId   userId   OrderInfo[orderCreatedDate]
1001      123      2015/07/07
1002      124      2015/08/07
1003      125      2015/09/07
1004      123      2015/08/07

How can I remove duplicate entries based on userId, keeping the entry with the earliest date for each user, as below:

orderId   userId   OrderInfo[orderCreatedDate]
1001      123      2015/07/07
1002      124      2015/08/07
1003      125      2015/09/07

The result should be a list of whole OrderDeatils objects.

I tried like:

list.stream().collect(
        Collectors.groupingBy(OrderDeatils::getUserId,
                Collectors.collectingAndThen(
                        Collectors.reducing((OrderDeatils d1, OrderDeatils d2) ->
                                d1.getInfo().getOrderCreatedDate().isBefore(d2.getInfo().getOrderCreatedDate()) ? d1 : d2),
                        Optional::get)));

But the result is not as expected: I am not getting the deduplicated List<OrderDeatils> as output.

Upvotes: 0

Views: 430

Answers (3)

M. Justin

Reputation: 21239

List<OrderDeatils> result = list.stream()
        .collect(Collectors.groupingBy(
                OrderDeatils::getUserId,
                Collectors.minBy(Comparator.comparing(o ->
                        o.getInfo().getOrderCreatedDate()))))
        .values().stream()
        .flatMap(Optional::stream)
        .toList();

This uses Collectors.groupingBy to group the results into a Map<Long, Optional<OrderDeatils>> where the key is the user ID and the value is the OrderDeatils with the earliest order created date that has the given user ID. The values are then converted to a List<OrderDeatils> using a stream over the map's values.
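
Note that Optional::stream requires Java 9 and Stream.toList() requires Java 16. For a strictly Java 8 setup, the same grouping idea could be sketched roughly as follows. The class name MinByDemo, the constructors, the helpers earliestPerUser and order, and the sample data (dates read as year/month/day) are assumptions added for illustration; only the bean fields and getters come from the question.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class MinByDemo {

    // Beans shaped like those in the question; constructors added for brevity.
    static class OrderInfo {
        private final OffsetDateTime orderCreatedDate;
        OrderInfo(OffsetDateTime orderCreatedDate) { this.orderCreatedDate = orderCreatedDate; }
        OffsetDateTime getOrderCreatedDate() { return orderCreatedDate; }
    }

    static class OrderDeatils {
        private final Long orderId;
        private final Long userId;
        private final OrderInfo info;
        OrderDeatils(Long orderId, Long userId, OrderInfo info) {
            this.orderId = orderId;
            this.userId = userId;
            this.info = info;
        }
        Long getOrderId() { return orderId; }
        Long getUserId() { return userId; }
        OrderInfo getInfo() { return info; }
    }

    // Hypothetical helper: keeps, per userId, the order with the earliest created date.
    static List<OrderDeatils> earliestPerUser(List<OrderDeatils> list) {
        Comparator<OrderDeatils> byCreatedDate =
                Comparator.comparing((OrderDeatils o) -> o.getInfo().getOrderCreatedDate());
        return list.stream()
                .collect(Collectors.groupingBy(OrderDeatils::getUserId,
                        Collectors.minBy(byCreatedDate)))
                .values().stream()
                // Java 8 replacement for flatMap(Optional::stream)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    // Hypothetical helper: builds a sample order created on day 7 of the given month in 2015.
    static OrderDeatils order(long orderId, long userId, int month) {
        OffsetDateTime created = OffsetDateTime.of(2015, month, 7, 0, 0, 0, 0, ZoneOffset.UTC);
        return new OrderDeatils(orderId, userId, new OrderInfo(created));
    }

    public static void main(String[] args) {
        List<OrderDeatils> list = Arrays.asList(
                order(1001, 123, 7), order(1002, 124, 8),
                order(1003, 125, 9), order(1004, 123, 8));
        earliestPerUser(list).forEach(o ->
                System.out.println(o.getOrderId() + " " + o.getUserId()));
    }
}
```

Since groupingBy collects into a HashMap by default, the order of the resulting list is not guaranteed; sort it afterwards if a particular order matters.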

Upvotes: 0

Aethernite

Reputation: 273

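You can use a simple set to do your filtering. When adding to the set, the .add() method returns true if the value is new and false if it already exists in the set.

To keep the earliest order per user, sort by the created date with .sorted() before filtering, so that the first element seen for each userId is the one with the minimum date. This relies on a stateful filter, so it is only suitable for a sequential stream.

Set<Long> userIdsSet = new HashSet<>();
List<OrderDeatils> filteredOrderDetails = list.stream()
        // Sort by created date first so the earliest order per user comes first
        .sorted(Comparator.comparing((OrderDeatils e) -> e.getInfo().getOrderCreatedDate()))
        // add() returns false for a userId that was already seen, dropping later duplicates
        .filter(e -> userIdsSet.add(e.getUserId()))
        .collect(Collectors.toList());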

Upvotes: -2

Eugene

Reputation: 120968

I will not write the code for you, but here are the steps:

  • You need the Collectors::toMap overload that takes a merge function as its third argument. Something like:
...collect(Collectors.toMap(
     OrderDeatils::getUserId,
     Function.identity(),
     (left, right) ->
        left.getInfo().getOrderCreatedDate().isBefore(right.getInfo().getOrderCreatedDate()) ? left : right
))
  • That will give you a Map<Long, OrderDeatils>, from which you need only the values.
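
Put together, the two steps above might look like the following minimal sketch. The class name ToMapDemo, the constructors, the helpers earliestPerUser and order, and the sample data (dates read as year/month/day) are assumptions for illustration; only the bean fields and getters come from the question.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ToMapDemo {

    // Beans shaped like those in the question; constructors added for brevity.
    static class OrderInfo {
        private final OffsetDateTime orderCreatedDate;
        OrderInfo(OffsetDateTime orderCreatedDate) { this.orderCreatedDate = orderCreatedDate; }
        OffsetDateTime getOrderCreatedDate() { return orderCreatedDate; }
    }

    static class OrderDeatils {
        private final Long orderId;
        private final Long userId;
        private final OrderInfo info;
        OrderDeatils(Long orderId, Long userId, OrderInfo info) {
            this.orderId = orderId;
            this.userId = userId;
            this.info = info;
        }
        Long getOrderId() { return orderId; }
        Long getUserId() { return userId; }
        OrderInfo getInfo() { return info; }
    }

    // Hypothetical helper: one entry per userId, keeping the earlier order on a key clash.
    static List<OrderDeatils> earliestPerUser(List<OrderDeatils> list) {
        Map<Long, OrderDeatils> earliestByUser = list.stream()
                .collect(Collectors.toMap(
                        OrderDeatils::getUserId,
                        Function.identity(),
                        (left, right) -> left.getInfo().getOrderCreatedDate()
                                .isBefore(right.getInfo().getOrderCreatedDate()) ? left : right));
        // Only the map's values are needed; copy them into a list.
        return new ArrayList<>(earliestByUser.values());
    }

    // Hypothetical helper: builds a sample order created on day 7 of the given month in 2015.
    static OrderDeatils order(long orderId, long userId, int month) {
        OffsetDateTime created = OffsetDateTime.of(2015, month, 7, 0, 0, 0, 0, ZoneOffset.UTC);
        return new OrderDeatils(orderId, userId, new OrderInfo(created));
    }

    public static void main(String[] args) {
        List<OrderDeatils> list = Arrays.asList(
                order(1001, 123, 7), order(1002, 124, 8),
                order(1003, 125, 9), order(1004, 123, 8));
        earliestPerUser(list).forEach(o ->
                System.out.println(o.getOrderId() + " " + o.getUserId()));
    }
}
```

The three-argument toMap returns a HashMap, so the order of the resulting list is not guaranteed. If two orders for the same user share the same date, this merge function keeps the right-hand one.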

Upvotes: 3
