Ivan B.

Reputation: 33

Group list of objects by key into list with sublists of unique objects (with java streams)

I have a list whose elements combine a key (id and name) with additional objects that are not otherwise related to each other.

Considering this structure:

record A0(String id, String name, B b, C c) {}

record A(String id, String name, Set<B> bs, Set<C> cs) {}

record B(String id, String name) {}

record C(String id, String name) {}

List<A0> a0s = new ArrayList<>();
a0s.add(new A0("1", "n1", new B("1", "nb1"), new C("1", "nc1")));
a0s.add(new A0("1", "n1", new B("1", "nb1"), new C("2", "nc2")));
a0s.add(new A0("1", "n1", new B("2", "nb2"), new C("3", "nc3")));
a0s.add(new A0("2", "n2", new B("2", "nb2"), new C("4", "nc4")));
a0s.add(new A0("2", "n2", new B("1", "nb1"), new C("5", "nc5")));
a0s.add(new A0("2", "n2", new B("2", "nb2"), new C("6", "nc6")));
a0s.add(new A0("3", "n3", new B("3", "nb3"), new C("7", "nc7")));
a0s.add(new A0("3", "n3", new B("3", "nb3"), new C("8", "nc8")));
a0s.add(new A0("4", "n4", new B("4", "nb4"), new C("9", "nc9")));
a0s.add(new A0("4", "n4", new B("5", "nb5"), new C("10", "nc10")));

I want to produce this result with Java streams:

[ {
  "id" : "1",
  "name" : "n1",
  "bs" : [ {
    "id" : "1",
    "name" : "nb1"
  }, {
    "id" : "2",
    "name" : "nb2"
  } ],
  "cs" : [ {
    "id" : "1",
    "name" : "nc1"
  }, {
    "id" : "2",
    "name" : "nc2"
  }, {
    "id" : "3",
    "name" : "nc3"
  } ]
}, {
  "id" : "2",
  "name" : "n2",
  "bs" : [ {
    "id" : "2",
    "name" : "nb2"
  }, {
    "id" : "1",
    "name" : "nb1"
  } ],
  "cs" : [ {
    "id" : "4",
    "name" : "nc4"
  }, {
    "id" : "5",
    "name" : "nc5"
  }, {
    "id" : "6",
    "name" : "nc6"
  } ]
}, {
  "id" : "3",
  "name" : "n3",
  "bs" : [ {
    "id" : "3",
    "name" : "nb3"
  } ],
  "cs" : [ {
    "id" : "7",
    "name" : "nc7"
  }, {
    "id" : "8",
    "name" : "nc8"
  } ]
}, {
  "id" : "4",
  "name" : "n4",
  "bs" : [ {
    "id" : "4",
    "name" : "nb4"
  }, {
    "id" : "5",
    "name" : "nb5"
  } ],
  "cs" : [ {
    "id" : "10",
    "name" : "nc10"
  }, {
    "id" : "9",
    "name" : "nc9"
  } ]
} ]

Here is my current code, which (obviously) does not use Java streams for the grouping:

import java.util.*;
import java.util.stream.Collectors;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;

class Scratch {

  record A0(String id, String name, B b, C c) {}

  record A(String id, String name, Set<B> bs, Set<C> cs) {}

  record B(String id, String name) {}

  record C(String id, String name) {}

  public static void main(String[] args) throws JsonProcessingException {
    List<A0> a0s = new ArrayList<>();
    a0s.add(new A0("1", "n1", new B("1", "nb1"), new C("1", "nc1")));
    a0s.add(new A0("1", "n1", new B("1", "nb1"), new C("2", "nc2")));
    a0s.add(new A0("1", "n1", new B("2", "nb2"), new C("3", "nc3")));
    a0s.add(new A0("2", "n2", new B("2", "nb2"), new C("4", "nc4")));
    a0s.add(new A0("2", "n2", new B("1", "nb1"), new C("5", "nc5")));
    a0s.add(new A0("2", "n2", new B("2", "nb2"), new C("6", "nc6")));
    a0s.add(new A0("3", "n3", new B("3", "nb3"), new C("7", "nc7")));
    a0s.add(new A0("3", "n3", new B("3", "nb3"), new C("8", "nc8")));
    a0s.add(new A0("4", "n4", new B("4", "nb4"), new C("9", "nc9")));
    a0s.add(new A0("4", "n4", new B("5", "nb5"), new C("10", "nc10")));

    Set<A> collectA = new HashSet<>();
    Map<String, Set<B>> mapAB = new HashMap<>();
    Map<String, Set<C>> mapAC = new HashMap<>();

    a0s.forEach(
        a0 -> {
          mapAB.computeIfAbsent(a0.id, k -> new HashSet<>()).add(a0.b);
          mapAC.computeIfAbsent(a0.id, k -> new HashSet<>()).add(a0.c);
          collectA.add(new A(a0.id, a0.name, new HashSet<>(), new HashSet<>()));
        });

    Set<A> outA = new HashSet<>();

    collectA.forEach(
        a -> {
          outA.add(new A(a.id, a.name, mapAB.get(a.id), mapAC.get(a.id)));
        });

    ObjectMapper objectMapper = new ObjectMapper();
    objectMapper.enable(SerializationFeature.INDENT_OUTPUT);
    String json =
        objectMapper.writeValueAsString(
            outA.stream()
                .sorted(Comparator.comparing(A::id))
                .collect(Collectors.toList()));

    System.out.println(json);
  }
}

I have read posts and the docs, but was unable to achieve it. This pointed me in some direction, but I was unable to go further by combining it with other solutions and reading the API docs. What bugs me is that I have multiple repeated objects to group (collect), and they must end up unique. I am using Set to take advantage of its uniqueness guarantee, but a List would be fine as well.

Upvotes: 2

Views: 899

Answers (2)

Alexander Ivanchenko

Reputation: 28988

groupingBy + teeing

One way to do that is to build the solution around the standard Collectors.

For convenience, we can introduce a couple of custom types.

A record which is meant to hold the unique properties id and name:

record IdName(String id, String name) {}

And another record for storing sets Set<B>, Set<C> associated with the same id:

record BCSets(Set<B> bs, Set<C> cs) {}

The logic of the stream:

  • Group the data using IdName as a Key by utilizing Collector groupingBy()
  • Make use of Collector teeing() as downstream of grouping. teeing() expects three arguments: two Collectors and a function combining the results produced by them. As both downstream Collectors of teeing() we can make use of the combination of mapping() and toSet(), and combine their results by generating an auxiliary record BCSets.
  • Then create a stream over the map entries and transform each entry into an instance of type A.
  • Sort the stream elements and collect them into a list.
List<A> listA = a0s.stream()
    .collect(Collectors.groupingBy(
        a0 -> new IdName(a0.id(), a0.name()),
        Collectors.teeing(
            Collectors.mapping(A0::b, Collectors.toSet()),
            Collectors.mapping(A0::c, Collectors.toSet()),
            BCSets::new
        )
    ))
    .entrySet().stream()
    .map(e -> new A(e.getKey().id(), e.getKey().name(), e.getValue().bs(), e.getValue().cs()))
    .sorted(Comparator.comparing(A::id))
    .toList();

groupingBy + custom Collector

Another option would be to create a custom Collector to be used as the downstream of groupingBy().

For that, we need to define a custom accumulation type that consumes elements from the stream and collects instances of B and C into sets. For convenience, I've implemented the Consumer interface:

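public static class ABCAccumulator implements Consumer<A0> {
    private final Set<B> bs = new HashSet<>();
    private final Set<C> cs = new HashSet<>();

    // Accumulates the B and C of a single element into the sets.
    @Override
    public void accept(A0 a0) {
        bs.add(a0.b());
        cs.add(a0.c());
    }

    // Combines two partial results (needed when the stream runs in parallel).
    public ABCAccumulator merge(ABCAccumulator other) {
        bs.addAll(other.bs);
        cs.addAll(other.cs);
        return this;
    }

    public Set<B> getBs() {
        return bs;
    }

    public Set<C> getCs() {
        return cs;
    }
}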

To create a custom Collector, we can use the static factory method Collector.of().

The overall logic remains the same, with one difference: now we have only two collectors (groupingBy() and the custom one), and the values of the auxiliary map are of a different type, ABCAccumulator.

List<A> listA = a0s.stream()
    .collect(Collectors.groupingBy(
        a0 -> new IdName(a0.id(), a0.name()),
        Collector.of(
            ABCAccumulator::new,
            ABCAccumulator::accept,
            ABCAccumulator::merge
        )
    ))
    .entrySet().stream()
    .map(e -> new A(e.getKey().id(), e.getKey().name(), e.getValue().getBs(), e.getValue().getCs()))
    .sorted(Comparator.comparing(A::id))
    .toList();

Upvotes: 3

Thiyagu

Reputation: 17890

Until I can think of a better approach....

I was writing a solution using Collectors.teeing, but @Alexander Ivanchenko beat me to it. You can refer to that answer for how to achieve this using Collectors.teeing.


My initial code without using Collectors.teeing:

First, we group the elements in the source list (a0s) by their id.

Map<String, List<A0>> groupById = a0s.stream()
        .collect(Collectors.groupingBy(A0::id));

Next, we stream the entries in the previous map and build A objects.

Set<A> outAResult = groupById.entrySet()
            .stream()
            .map(entry -> new A(entry.getKey(),
                    entry.getValue().get(0).name(), //since grouped by A0's id - name will be same for all elements
                    transform(entry.getValue(), A0::b),
                    transform(entry.getValue(), A0::c)))
            .collect(Collectors.toSet());

private static <T> Set<T> transform(List<A0> a0s, Function<A0, T> mapper) {
    return a0s.stream()
            .map(mapper)
            .collect(Collectors.toSet());
}

One issue is that we have to stream the elements of each List<A0> (each value in the groupById map) twice, once to extract the Set<B> and once to extract the Set<C>.
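
If the double pass is a concern, one possible alternative is to merge the elements in a single pass with Collectors.toMap() and a merge function. This is only a sketch and not part of the original code: the class name SinglePassMerge and the method group are hypothetical, the records are copied from the question, and it assumes that b and c are never null (Set.of rejects nulls). It trades the second pass for two small temporary sets per element.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class SinglePassMerge {

    record A0(String id, String name, B b, C c) {}
    record A(String id, String name, Set<B> bs, Set<C> cs) {}
    record B(String id, String name) {}
    record C(String id, String name) {}

    // Hypothetical helper: builds one A per id in a single pass over a0s.
    static List<A> group(List<A0> a0s) {
        Map<String, A> byId = a0s.stream()
                .collect(Collectors.toMap(
                        A0::id,
                        // Each element starts as an A holding mutable singleton sets.
                        a0 -> new A(a0.id(), a0.name(),
                                new HashSet<>(Set.of(a0.b())),
                                new HashSet<>(Set.of(a0.c()))),
                        // Elements with the same id are merged into the first A seen.
                        (left, right) -> {
                            left.bs().addAll(right.bs());
                            left.cs().addAll(right.cs());
                            return left;
                        }));

        return byId.values().stream()
                .sorted(Comparator.comparing(A::id))
                .toList();
    }

    public static void main(String[] args) {
        List<A0> a0s = List.of(
                new A0("1", "n1", new B("1", "nb1"), new C("1", "nc1")),
                new A0("1", "n1", new B("1", "nb1"), new C("2", "nc2")),
                new A0("2", "n2", new B("2", "nb2"), new C("4", "nc4")));
        group(a0s).forEach(System.out::println);
    }
}
```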

Note: Extracting the name of the A0 object via entry.getValue().get(0).name() doesn't look great. To avoid this, you can create a temporary object (a record) that captures the id and name, and group by that.
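
A minimal sketch of that idea, assuming a hypothetical key record named IdName and a hypothetical wrapper class GroupByIdName (the other records are copied from the question). The map key then carries both id and name, so no get(0) is needed:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupByIdName {

    record A0(String id, String name, B b, C c) {}
    record A(String id, String name, Set<B> bs, Set<C> cs) {}
    record B(String id, String name) {}
    record C(String id, String name) {}

    // Hypothetical composite key: records get value-based equals/hashCode for free.
    record IdName(String id, String name) {}

    static List<A> group(List<A0> a0s) {
        // Group by the (id, name) pair instead of by id alone.
        Map<IdName, List<A0>> byKey = a0s.stream()
                .collect(Collectors.groupingBy(a0 -> new IdName(a0.id(), a0.name())));

        // Build one A per key; both id and name come from the key itself.
        return byKey.entrySet().stream()
                .map(e -> new A(e.getKey().id(), e.getKey().name(),
                        transform(e.getValue(), A0::b),
                        transform(e.getValue(), A0::c)))
                .sorted(Comparator.comparing(A::id))
                .toList();
    }

    static <T> Set<T> transform(List<A0> a0s, Function<A0, T> mapper) {
        return a0s.stream().map(mapper).collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<A0> a0s = List.of(
                new A0("1", "n1", new B("1", "nb1"), new C("1", "nc1")),
                new A0("1", "n1", new B("1", "nb1"), new C("2", "nc2")),
                new A0("2", "n2", new B("2", "nb2"), new C("4", "nc4")));
        group(a0s).forEach(System.out::println);
    }
}
```

Note that if two elements share an id but differ in name, this variant produces two separate A objects rather than one.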

Upvotes: 2
