Reputation: 750

Elegant way to flatMap Set of Sets inside groupingBy

I have a piece of code that iterates over a list of rows. Each row is a ReportData that contains a case with a Long caseId and one Ruling. Each Ruling has one or more Payments. I want a Map with the caseId as key and the set of payments as value (i.e. a Map<Long, Set<Payment>>).

Cases are not unique across rows, but rulings are.

In other words, I can have several rows with the same case, but they will have unique rulings.

The following code gives me a Map<Long, Set<Set<Payment>>>, which is almost what I want, but I've been struggling to find the correct way to flatMap the final set in this context. I've been working around it by using this map as is, but I'd very much like to fix the algorithm so that it combines the payments into one single set instead of creating a set of sets.

I've searched around and couldn't find a problem with the same kind of iteration, although flatMapping with Java streams seems like a somewhat popular topic.

rowData.stream()
        .collect(Collectors.groupingBy(
            r -> r.Case.getCaseId(),
            Collectors.mapping(
                r -> r.getRuling(),
                Collectors.mapping(ruling->
                    ruling.getPayments(),
                    Collectors.toSet()
                )
            )));

Upvotes: 7

Answers (3)

Ousmane D.

Reputation: 56433

Another JDK8 solution:

Map<Long, Set<Payment>> resultSet = 
         rowData.stream()
                .collect(Collectors.toMap(p -> p.Case.getCaseId(),
                        p -> new HashSet<>(p.getRuling().getPayments()),
                        (l, r) -> { l.addAll(r); return l; }));

Or, as of JDK 9, you can use the flatMapping collector:

rowData.stream()
       .collect(Collectors.groupingBy(r -> r.Case.getCaseId(), 
              Collectors.flatMapping(e -> e.getRuling().getPayments().stream(), 
                        Collectors.toSet())));
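
For anyone who wants to try the flatMapping variant in isolation, here is a minimal, self-contained sketch. The Payment, Ruling and ReportData classes are hypothetical stand-ins (the question does not show the real ones), and getCaseId() on ReportData is an assumed shortcut for the Case field access used above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class FlatMappingSketch {

    // Hypothetical stand-ins for the question's classes; the real ones will differ.
    static final class Payment {
        final String id;
        Payment(String id) { this.id = id; }
        @Override public String toString() { return id; }
    }

    static final class Ruling {
        private final Set<Payment> payments;
        Ruling(Payment... payments) { this.payments = new HashSet<>(Arrays.asList(payments)); }
        Set<Payment> getPayments() { return payments; }
    }

    static final class ReportData {
        private final Long caseId;
        private final Ruling ruling;
        ReportData(Long caseId, Ruling ruling) { this.caseId = caseId; this.ruling = ruling; }
        Long getCaseId() { return caseId; }
        Ruling getRuling() { return ruling; }
    }

    // Needs Java 9+ because of Collectors.flatMapping.
    static Map<Long, Set<Payment>> paymentsByCase(List<ReportData> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        ReportData::getCaseId,
                        // Each row contributes a stream of payments to its case's set.
                        Collectors.flatMapping(
                                r -> r.getRuling().getPayments().stream(),
                                Collectors.toSet())));
    }

    public static void main(String[] args) {
        Payment a = new Payment("a"), b = new Payment("b"), c = new Payment("c");
        List<ReportData> rows = Arrays.asList(
                new ReportData(1L, new Ruling(a, b)),  // two rows share case 1
                new ReportData(1L, new Ruling(c)),
                new ReportData(2L, new Ruling(a)));
        System.out.println(paymentsByCase(rows));
    }
}
```

Case 1 ends up with a single set holding the payments of both of its rulings, rather than a set of two sets.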

Upvotes: 6

Didier L

Reputation: 20579

The cleanest solution is to define your own collector:

Map<Long, Set<Payment>> result = rowData.stream()
        .collect(Collectors.groupingBy(
                ReportData::getCaseId,
                Collector.of(HashSet::new,
                        (s, r) -> s.addAll(r.getRuling().getPayments()),
                        (s1, s2) -> { s1.addAll(s2); return s1; })
        ));
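
If you need this in more than one place, the same Collector.of call can be pulled out into a small generic helper, which works on Java 8. This is only a sketch: flatteningToSet is a made-up name, and the Payment, Ruling and ReportData classes below are hypothetical stand-ins for the ones in the question.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CustomCollectorSketch {

    // Hypothetical stand-ins for the question's classes; the real ones will differ.
    static final class Payment {
        final String id;
        Payment(String id) { this.id = id; }
        @Override public String toString() { return id; }
    }

    static final class Ruling {
        private final Set<Payment> payments;
        Ruling(Payment... payments) { this.payments = new HashSet<>(Arrays.asList(payments)); }
        Set<Payment> getPayments() { return payments; }
    }

    static final class ReportData {
        private final Long caseId;
        private final Ruling ruling;
        ReportData(Long caseId, Ruling ruling) { this.caseId = caseId; this.ruling = ruling; }
        Long getCaseId() { return caseId; }
        Ruling getRuling() { return ruling; }
    }

    // Hypothetical helper: adds every element of each mapped collection to one mutable HashSet.
    static <T, U> Collector<T, ?, Set<U>> flatteningToSet(
            Function<? super T, ? extends Collection<? extends U>> mapper) {
        return Collector.of(
                HashSet::new,                                           // one container per group
                (set, item) -> set.addAll(mapper.apply(item)),          // accumulate in place, no copies
                (left, right) -> { left.addAll(right); return left; }); // merge partial results (parallel streams)
    }

    static Map<Long, Set<Payment>> paymentsByCase(List<ReportData> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        ReportData::getCaseId,
                        flatteningToSet(r -> r.getRuling().getPayments())));
    }

    public static void main(String[] args) {
        Payment a = new Payment("a"), b = new Payment("b"), c = new Payment("c");
        List<ReportData> rows = Arrays.asList(
                new ReportData(1L, new Ruling(a, b)),
                new ReportData(1L, new Ruling(c)),
                new ReportData(2L, new Ruling(a)));
        System.out.println(paymentsByCase(rows));
    }
}
```

Because the accumulator mutates a single HashSet per group, no intermediate sets are created along the way.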

Two other solutions that I thought of first are actually less efficient and less readable, but they still avoid constructing the intermediate Map:

Merging the inner sets using Collectors.reducing():

Map<Long, Set<Payment>> result = rowData.stream()
        .collect(Collectors.groupingBy(
                ReportData::getCaseId,
                Collectors.reducing(Collections.emptySet(),
                        r -> r.getRuling().getPayments(),
                        (s1, s2) -> {
                            Set<Payment> r = new HashSet<>(s1);
                            r.addAll(s2);
                            return r;
                        })
        ));

Here the reducing operation merges the Set<Payment> of entries with the same caseId. However, each merge copies the accumulated set into a new HashSet, so this can cause a lot of copying if many rows share a caseId.

Another solution is with a downstream collector that flatmaps the nested collections:

Map<Long, Set<Payment>> result = rowData.stream()
        .collect(Collectors.groupingBy(
                ReportData::getCaseId,
                Collectors.collectingAndThen(
                        Collectors.mapping(r -> r.getRuling().getPayments(), Collectors.toList()),
                        s -> s.stream().flatMap(Set::stream).collect(Collectors.toSet())))
        );

Basically it puts all sets of matching caseId together in a List, then flatmaps that list into a single Set.
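
To check that these two downstream variants really produce the same map, something like the following sketch can be run on Java 8. The Payment, Ruling and ReportData classes are hypothetical stand-ins for the question's classes, and the lambda parameter types in the second method are spelled out only to keep type inference simple.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class DownstreamVariantsSketch {

    // Hypothetical stand-ins for the question's classes; the real ones will differ.
    static final class Payment {
        final String id;
        Payment(String id) { this.id = id; }
        @Override public String toString() { return id; }
    }

    static final class Ruling {
        private final Set<Payment> payments;
        Ruling(Payment... payments) { this.payments = new HashSet<>(Arrays.asList(payments)); }
        Set<Payment> getPayments() { return payments; }
    }

    static final class ReportData {
        private final Long caseId;
        private final Ruling ruling;
        ReportData(Long caseId, Ruling ruling) { this.caseId = caseId; this.ruling = ruling; }
        Long getCaseId() { return caseId; }
        Ruling getRuling() { return ruling; }
    }

    // Variant 1: merge the per-row sets pairwise; every merge allocates a new HashSet.
    static Map<Long, Set<Payment>> viaReducing(List<ReportData> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                ReportData::getCaseId,
                Collectors.reducing(Collections.<Payment>emptySet(),
                        r -> r.getRuling().getPayments(),
                        (s1, s2) -> {
                            Set<Payment> merged = new HashSet<>(s1);
                            merged.addAll(s2);
                            return merged;
                        })));
    }

    // Variant 2: gather the per-row sets in a List, then flatten that list once per group.
    static Map<Long, Set<Payment>> viaCollectingAndThen(List<ReportData> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                ReportData::getCaseId,
                Collectors.collectingAndThen(
                        Collectors.mapping((ReportData r) -> r.getRuling().getPayments(),
                                Collectors.toList()),
                        (List<Set<Payment>> sets) -> sets.stream()
                                .flatMap(Set::stream)
                                .collect(Collectors.toSet()))));
    }

    public static void main(String[] args) {
        Payment a = new Payment("a"), b = new Payment("b"), c = new Payment("c");
        List<ReportData> rows = Arrays.asList(
                new ReportData(1L, new Ruling(a, b)),
                new ReportData(1L, new Ruling(c)),
                new ReportData(2L, new Ruling(a)));
        // Map.equals compares keys and value sets, so this checks both variants agree.
        System.out.println(viaReducing(rows).equals(viaCollectingAndThen(rows)));
    }
}
```

The two methods differ only in when the merging happens: the first merges on every row, the second defers all merging to a single pass per group.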

Upvotes: 2

marstran

Reputation: 28036

There are probably better ways to do this, but this is the best I found:

Map<Long, Set<Payment>> result =
        rowData.stream()
                // First group by caseIds.
                .collect(Collectors.groupingBy(r -> r.Case.getCaseId()))
                .entrySet().stream()
                // By streaming over the entrySet, I map the values to the set of payments.
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        entry -> entry.getValue().stream()
                                .flatMap(r -> r.getRuling().getPayments().stream())
                                .collect(Collectors.toSet())));

Upvotes: 1
