dev

Reputation: 11319

How does control flow in Java 8 collectors?

I am learning how to use Java 8 streams. While debugging this piece of code:

Collector<Person, StringJoiner, String> collector =  
    Collector.of(
        () -> new StringJoiner(" | "),
        (j,p) -> j.add(p.name.toLowerCase()),
        StringJoiner::merge,
        StringJoiner::toString);
System.out.println(persons.stream().collect(collector));

execution never reaches StringJoiner::merge or StringJoiner::toString. If I replace the combiner (StringJoiner::merge) with null, the code throws a NullPointerException. I am unable to follow why.

Additional (related) question:

How can I add logging for debugging lambdas? I tried adding braces for multi-line code blocks. This does not compile:

Collector<Person, StringJoiner, String> collector =
    Collector.of(
        () -> {
        System.out.println("Supplier");
        new StringJoiner(" | ")},
        (j,p) -> j.add(p.name.toLowerCase()),
        StringJoiner::merge,
        StringJoiner::toString);

Upvotes: 2

Views: 275

Answers (1)

JB Nizet

Reputation: 691865

Here's your code with debug statements added (I replaced Person with String, but it doesn't change anything):

    List<String> persons = Arrays.asList("John", "Mary", "Jack", "Jen");
    Collector<String, StringJoiner, String> collector =
        Collector.of(
            () -> {
                System.out.println("Supplier");
                return new StringJoiner(" | ");
            },
            (j, p) -> {
                System.out.println("Accumulator");
                j.add(p.toLowerCase());
            },
            (stringJoiner, other) -> {
                System.out.println("Combiner");
                return stringJoiner.merge(other);
            },
            (stringJoiner) -> {
                System.out.println("Finisher");
                return stringJoiner.toString();
            });
    System.out.println(persons.stream().collect(collector));

Run it, and you'll see that the finisher is definitely called:

  • a StringJoiner is created by the supplier
  • all persons are added to the joiner
  • the finisher transforms the joiner to a String
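The three steps above can be written out by hand. The following is only a sketch of what a sequential collect amounts to, not the JDK's actual implementation; the class name ManualCollect and the helper methods collectSequentially and joining are made up for illustration. Only supplier(), accumulator() and finisher() come from the Collector interface:

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.function.BiConsumer;
import java.util.stream.Collector;

public class ManualCollect {

    // Hypothetical helper: drives a collector by hand, the way a
    // sequential stream conceptually does. The combiner is never touched.
    static <T, A, R> R collectSequentially(Iterable<T> items, Collector<T, A, R> collector) {
        A container = collector.supplier().get();             // 1. one container is created
        BiConsumer<A, T> accumulator = collector.accumulator();
        for (T item : items) {
            accumulator.accept(container, item);              // 2. every element is accumulated
        }
        return collector.finisher().apply(container);         // 3. the finisher builds the result
    }

    // The collector from the question, with String instead of Person.
    static Collector<String, StringJoiner, String> joining() {
        return Collector.of(
            () -> new StringJoiner(" | "),
            (j, p) -> j.add(p.toLowerCase()),
            StringJoiner::merge,
            StringJoiner::toString);
    }

    public static void main(String[] args) {
        List<String> persons = Arrays.asList("John", "Mary", "Jack", "Jen");
        // Both lines print: john | mary | jack | jen
        System.out.println(collectSequentially(persons, joining()));
        System.out.println(persons.stream().collect(joining()));
    }
}
```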

The combiner, however, is only relevant if the collector is used on a parallel stream and the stream actually decides to split the work across multiple threads, thus using multiple joiners that must then be combined. It is nevertheless required by Collector.of(), which checks its arguments for null; that is why passing null throws a NullPointerException even though the combiner is never called.

To see the combiner being called, you'll need a large number of persons in the collection, and a parallel stream instead of a sequential one:
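To illustrate the null check, here is a small sketch showing that the exception comes from building the collector, before any stream is involved. The class name NullCombinerDemo and the method rejectsNullCombiner are hypothetical names chosen for this example:

```java
import java.util.StringJoiner;
import java.util.function.BinaryOperator;
import java.util.stream.Collector;

public class NullCombinerDemo {

    // Hypothetical helper: returns true if Collector.of refuses a null
    // combiner while the collector is being constructed.
    static boolean rejectsNullCombiner() {
        try {
            Collector<String, StringJoiner, String> collector = Collector.of(
                () -> new StringJoiner(" | "),
                (j, p) -> j.add(p.toLowerCase()),
                (BinaryOperator<StringJoiner>) null,   // combiner deliberately missing
                StringJoiner::toString);
            return false;                              // no exception: collector was built
        } catch (NullPointerException e) {
            return true;                               // rejected before any stream ran
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullCombiner());     // prints true
    }
}
```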

    List<String> persons = new ArrayList<>();
    for (int i = 0; i < 1_000_000; i++) {
        persons.add("p_" + i);
    }
    Collector<String, StringJoiner, String> collector =
        Collector.of(
            () -> {
                System.out.println("Supplier");
                return new StringJoiner(" | ");
            },
            (j, p) -> {
                System.out.println("Accumulator");
                j.add(p.toLowerCase());
            },
            (stringJoiner, other) -> {
                System.out.println("Combiner");
                return stringJoiner.merge(other);
            },
            (stringJoiner) -> {
                System.out.println("Finisher");
                return stringJoiner.toString();
            });
    System.out.println(persons.parallelStream().collect(collector));

The stream decides how many threads to use, and it can split the task given to one thread into two further subtasks partway through if it thinks that's a good idea. Let's just assume it chooses to use 2:

  • two StringJoiners are created by the supplier, and a thread is allocated for each joiner
  • each thread adds half of the persons to its joiner
  • the two joiners are merged together by the combiner
  • the finisher transforms the merged joiner to a String
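Printing a line per call gets noisy with a large collection, so a variant that counts calls instead can make the difference between the two modes easier to see. This is a sketch only: the class CollectorCallCounts, its fields, and the helpers run and samplePersons are invented for the example, and the counts in parallel mode depend on how the stream chooses to split the work on your machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class CollectorCallCounts {
    // Hypothetical counters; atomic because parallel streams call the
    // collector functions from several threads.
    final AtomicInteger suppliers = new AtomicInteger();
    final AtomicInteger combiners = new AtomicInteger();
    final AtomicInteger finishers = new AtomicInteger();
    String result;

    // Runs the collector once and records how often each function was called.
    static CollectorCallCounts run(List<String> persons, boolean parallel) {
        CollectorCallCounts c = new CollectorCallCounts();
        Collector<String, StringJoiner, String> collector = Collector.of(
            () -> { c.suppliers.incrementAndGet(); return new StringJoiner(" | "); },
            (j, p) -> j.add(p.toLowerCase()),
            (left, right) -> { c.combiners.incrementAndGet(); return left.merge(right); },
            j -> { c.finishers.incrementAndGet(); return j.toString(); });
        Stream<String> stream = parallel ? persons.parallelStream() : persons.stream();
        c.result = stream.collect(collector);
        return c;
    }

    // Builds the same kind of test data as above: p_0, p_1, ...
    static List<String> samplePersons(int n) {
        List<String> persons = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            persons.add("p_" + i);
        }
        return persons;
    }

    public static void main(String[] args) {
        List<String> persons = samplePersons(10_000);
        for (boolean parallel : new boolean[] {false, true}) {
            CollectorCallCounts c = run(persons, parallel);
            System.out.println((parallel ? "parallel" : "sequential")
                + ": suppliers=" + c.suppliers
                + ", combiners=" + c.combiners
                + ", finishers=" + c.finishers);
        }
    }
}
```

In sequential mode this should report one supplier call, no combiner calls, and one finisher call. In parallel mode the finisher still runs once, while the supplier and combiner counts grow with the number of chunks the stream decided to create.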

Upvotes: 3
