Vincent Tang

Reputation: 157

Stream collectingAndThen get latest record to sum a value

I have the dataset below. My goal is to group by the first two columns and, for each group, sum Column4 using only the latest version (highest Column5) of each Column3 value.

// Column5 = version
new Foo(1, "bbb", "cccc", 111, 0)
new Foo(1, "bbb", "cccc", 234, 1) // latest
new Foo(1, "bbb", "dddd", 111, 0)
new Foo(1, "bbb", "dddd", 112, 1)
new Foo(1, "bbb", "dddd", 113, 2)
new Foo(1, "bbb", "dddd", 114, 3) // latest
new Foo(1, "xxx", "cccc", 111, 0) // latest
new Foo(2, "xxx", "yyyy", 0, 0)
new Foo(2, "xxx", "yyyy", 1, 1)   // latest
...

Here is what I have tried:

// key: Column1, key: Column2, value: latest sum of Column4
Map<Long, Map<String, Integer>> fooMap = fooList.stream().collect(
    Collectors.groupingBy(Foo::getColumn1, Collectors.groupingBy(Foo::getColumn2,
            Collectors.collectingAndThen(????))));

For the ???? part I've tried Collectors.groupingBy, Collectors.maxBy and Collectors.summingInt, but the result is always wrong. 😭

The resulting Map should look like this:

1->bbb->348, 1->xxx->111, 2->xxx->1

Please help 😵 Let me know if you need any additional information. Thanks.

Upvotes: 2

Views: 1450

Answers (2)

areus

Reputation: 2948

You can get it with the following (the Collectors methods and Comparator.comparing are statically imported):

    Map<Long, Map<String, Integer>> fooMap = fooList.stream().collect(
            groupingBy(Foo::getColumn1,
                    groupingBy(Foo::getColumn2,
                            collectingAndThen(
                                    groupingBy(Foo::getColumn3,
                                            collectingAndThen(
                                                    maxBy(comparing(Foo::getVersion)),
                                                    Optional::get
                                            )),
                                    m -> m.values().stream().mapToInt(Foo::getColumn4).sum()
                            )
                    )
            ));

First we group by column1 and column2. The downstream collector is a collectingAndThen wrapped around a grouping by column3, because we want to post-process that inner map.

Within each column3 group we keep the element with the maximum version. That needs another collectingAndThen, because maxBy produces an Optional; applying Optional::get gives a Map<String, Foo> instead of a Map<String, Optional<Foo>>. The get is safe here because groupingBy only creates a group once it has at least one element.

The post-processing step then sums column4 over the Foo values in that map, which are exactly the ones with the maximum version.
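To check this against the data in the question, here is a minimal, self-contained sketch. The Foo class, the class name LatestSumDemo and the helper methods sampleData and latestSums are assumptions made for illustration (the question does not show Foo's definition), and Comparator.comparingInt is used in place of comparing only because version is modelled as an int here.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class LatestSumDemo {

    // Hypothetical stand-in for the asker's Foo; getter names follow the question.
    static final class Foo {
        private final Long column1;
        private final String column2;
        private final String column3;
        private final int column4;
        private final int version;

        Foo(long column1, String column2, String column3, int column4, int version) {
            this.column1 = column1;
            this.column2 = column2;
            this.column3 = column3;
            this.column4 = column4;
            this.version = version;
        }

        Long getColumn1() { return column1; }
        String getColumn2() { return column2; }
        String getColumn3() { return column3; }
        int getColumn4() { return column4; }
        int getVersion() { return version; }
    }

    // The rows listed in the question.
    static List<Foo> sampleData() {
        return List.of(
                new Foo(1, "bbb", "cccc", 111, 0),
                new Foo(1, "bbb", "cccc", 234, 1),
                new Foo(1, "bbb", "dddd", 111, 0),
                new Foo(1, "bbb", "dddd", 112, 1),
                new Foo(1, "bbb", "dddd", 113, 2),
                new Foo(1, "bbb", "dddd", 114, 3),
                new Foo(1, "xxx", "cccc", 111, 0),
                new Foo(2, "xxx", "yyyy", 0, 0),
                new Foo(2, "xxx", "yyyy", 1, 1));
    }

    static Map<Long, Map<String, Integer>> latestSums(List<Foo> foos) {
        return foos.stream().collect(
                Collectors.groupingBy(Foo::getColumn1,
                        Collectors.groupingBy(Foo::getColumn2,
                                Collectors.collectingAndThen(
                                        // column3 -> Foo with the highest version
                                        Collectors.groupingBy(Foo::getColumn3,
                                                Collectors.collectingAndThen(
                                                        Collectors.maxBy(Comparator.comparingInt(Foo::getVersion)),
                                                        Optional::get)),
                                        // sum column4 over those latest rows
                                        latest -> latest.values().stream()
                                                .mapToInt(Foo::getColumn4).sum()))));
    }

    public static void main(String[] args) {
        System.out.println(latestSums(sampleData()));
    }
}
```

With the sample rows this gives 348 for 1->bbb (234 + 114), 111 for 1->xxx and 1 for 2->xxx, which matches the map the question asks for.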

Upvotes: 2

Naman

Reputation: 31868

To keep things simple, the models can be represented as records:

record Foo(Long one, String two, String three, int value, int version) {
}

record Result(Long one, String two, int totalValue) {
}

You can start by grouping on the first three attributes and keeping, for each key, the Foo with the maximum version.

Map<List<Object>, Foo> groupedMaxVersion = fooList.stream()
        .collect(Collectors.toMap(foo -> Arrays.asList(foo.one(), foo.two(), foo.three()),
                foo -> foo, BinaryOperator.maxBy(Comparator.comparing(Foo::version))));

This can be followed by the summing downstream you were looking for, grouping on the first two attributes and adding up the value in column 4:

Map<List<Object>, Integer> resultMapping = groupedMaxVersion.entrySet().stream()
        .collect(Collectors.groupingBy(e -> Arrays.asList(e.getKey().get(0), e.getKey().get(1)),
                Collectors.summingInt(e -> e.getValue().value())));

Finally, you just need to map it onto the desired result data structure:

List<Result> results = resultMapping.entrySet().stream()
        .map(e -> new Result((Long) e.getKey().get(0), (String) e.getKey().get(1), e.getValue()))
        .collect(Collectors.toList());
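Put together, the three steps can be run end to end as in the sketch below. It assumes Java 16 or later for records; the class name LatestSumPipeline and the helpers sampleData and summarize are hypothetical names added for illustration, and Comparator.comparingInt is swapped in for comparing since version is an int.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

public class LatestSumPipeline {

    record Foo(Long one, String two, String three, int value, int version) {
    }

    record Result(Long one, String two, int totalValue) {
    }

    // The rows listed in the question.
    static List<Foo> sampleData() {
        return List.of(
                new Foo(1L, "bbb", "cccc", 111, 0),
                new Foo(1L, "bbb", "cccc", 234, 1),
                new Foo(1L, "bbb", "dddd", 111, 0),
                new Foo(1L, "bbb", "dddd", 112, 1),
                new Foo(1L, "bbb", "dddd", 113, 2),
                new Foo(1L, "bbb", "dddd", 114, 3),
                new Foo(1L, "xxx", "cccc", 111, 0),
                new Foo(2L, "xxx", "yyyy", 0, 0),
                new Foo(2L, "xxx", "yyyy", 1, 1));
    }

    static List<Result> summarize(List<Foo> fooList) {
        // Step 1: for each (one, two, three) key keep the Foo with the highest version.
        Map<List<Object>, Foo> groupedMaxVersion = fooList.stream()
                .collect(Collectors.toMap(
                        foo -> Arrays.asList(foo.one(), foo.two(), foo.three()),
                        foo -> foo,
                        BinaryOperator.maxBy(Comparator.comparingInt(Foo::version))));

        // Step 2: regroup on (one, two) and sum the surviving values.
        Map<List<Object>, Integer> resultMapping = groupedMaxVersion.entrySet().stream()
                .collect(Collectors.groupingBy(
                        e -> Arrays.asList(e.getKey().get(0), e.getKey().get(1)),
                        Collectors.summingInt(e -> e.getValue().value())));

        // Step 3: map each entry onto the Result record.
        return resultMapping.entrySet().stream()
                .map(e -> new Result((Long) e.getKey().get(0), (String) e.getKey().get(1), e.getValue()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        summarize(sampleData()).forEach(System.out::println);
    }
}
```

The results come out of a HashMap, so their order is not guaranteed; sort the list if a stable order matters. The casts in step 3 are the price of using List<Object> as a composite key, and a small key record would avoid them at the cost of one more type.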

Upvotes: 2
