popcorny

Reputation: 1936

Why is the Java 8 'Collector' class designed in this way?

Java 8 introduced the new Stream API, and java.util.stream.Collector is the interface that defines how to aggregate (collect) the elements of a stream.

However, the Collector interface is designed like this:

public interface Collector<T, A, R> {
    Supplier<A> supplier();
    BiConsumer<A, T> accumulator();
    BinaryOperator<A> combiner();
    Function<A, R> finisher();
}

Why is it not designed like the following?

public interface Collector<T, A, R> {
    A supply();
    void accumulate(A accumulator, T value);
    A combine(A left, A right);
    R finish(A accumulator);
}

The latter is much easier to implement. What were the considerations behind choosing the former design?

Upvotes: 36

Views: 2804

Answers (3)

Ira

Reputation: 764

Two related reasons:

  • Functional composition via combinators. (OO composition is still possible, but see the next point.)
  • The ability to express business logic in succinct, expressive code with a lambda expression or method reference, which works whenever the assignment target is a functional interface.

    Functional composition

    The Collectors API paves the way for functional composition via combinators, i.e. you build small reusable pieces of functionality and combine several of them, often in interesting ways, into a more advanced feature or function.

    Succinct, expressive code

    Below, a method reference (Employee::getSalary) supplies the mapper from an Employee object to an int, and summingInt supplies the logic for adding the ints. Combined, they give the sum of salaries in a single line of declarative code.

    // Compute the sum of the employees' salaries
    int total = employees.stream()
                         .collect(Collectors.summingInt(Employee::getSalary));
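    To illustrate the combinator point, here is a minimal sketch that nests summingInt inside groupingBy. The Employee class and its getDepartment() accessor are hypothetical and exist only for this example; groupingBy and summingInt are standard java.util.stream.Collectors methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SalaryByDepartment {
    // Hypothetical Employee type; getDepartment() is an assumption made for this sketch.
    static final class Employee {
        private final String department;
        private final int salary;

        Employee(String department, int salary) {
            this.department = department;
            this.salary = salary;
        }

        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    // groupingBy takes another collector as its downstream argument,
    // so summingInt is reused unchanged to total each group.
    static Map<String, Integer> totals(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(Employee::getDepartment,
                        Collectors.summingInt(Employee::getSalary)));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("eng", 100),
                new Employee("eng", 200),
                new Employee("ops", 50));
        System.out.println(totals(employees));
    }
}
```

    The same summingInt collector serves both as a top-level collector and as a building block inside a larger one.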

Upvotes: 0

s.d

Reputation: 29436

Composition is favored over inheritance.

The first pattern in your question is a kind of module configuration. Implementations of the Collector interface can provide varying implementations of the supplier, accumulator, and so on. This means one can compose Collector implementations from an existing pool of supplier, accumulator, etc. implementations. It also helps reuse, since two collectors might use the same accumulator implementation. Stream.collect() then uses the supplied behaviors.

In the second pattern, the Collector implementation has to implement all of the functions itself. Every variation would require overriding the parent implementation. That leaves little scope for reuse and leads to code duplication when two collectors share the same logic for a step, for example accumulation.
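As a rough sketch of that reuse, the two collectors below are assembled with Collector.of from the same supplier, accumulator, and combiner objects and differ only in the finisher. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;

class SharedPieces {
    // Building blocks shared by both collectors below.
    static final Supplier<List<String>> NEW_LIST = ArrayList::new;
    static final BiConsumer<List<String>, String> ADD = List::add;
    static final BinaryOperator<List<String>> MERGE = (left, right) -> {
        left.addAll(right);
        return left;
    };

    // Same supplier, accumulator and combiner; only the finisher differs.
    static Collector<String, List<String>, List<String>> toReadOnlyList() {
        return Collector.of(NEW_LIST, ADD, MERGE, Collections::unmodifiableList);
    }

    static Collector<String, List<String>, Integer> countViaList() {
        return Collector.of(NEW_LIST, ADD, MERGE, List::size);
    }

    public static void main(String[] args) {
        System.out.println(Stream.of("a", "b", "c").collect(toReadOnlyList()));
        System.out.println(Stream.of("a", "b", "c").collect(countViaList()));
    }
}
```

With the method-based interface, each of these collectors would need its own class that repeats the supply, accumulate, and combine bodies (or inherits them from a common parent).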

Upvotes: 18

Tagir Valeev

Reputation: 100209

Actually, it was originally designed much like what you propose. See the early implementation in the Project Lambda repository (makeResult is now supplier). It was later changed to the current design. I believe the rationale for the change was to simplify collector combinators. I did not find any specific discussion of this topic, but my guess is supported by the fact that the mapping collector appeared in the same changeset. Consider the implementation of Collectors.mapping:

public static <T, U, A, R>
Collector<T, ?, R> mapping(Function<? super T, ? extends U> mapper,
                           Collector<? super U, A, R> downstream) {
    BiConsumer<A, ? super U> downstreamAccumulator = downstream.accumulator();
    return new CollectorImpl<>(downstream.supplier(),
                               (r, t) -> downstreamAccumulator.accept(r, mapper.apply(t)),
                               downstream.combiner(), downstream.finisher(),
                               downstream.characteristics());
}

This implementation needs to redefine only the accumulator function, leaving the supplier, combiner, and finisher as they are, so there is no additional indirection when calling the supplier, combiner, or finisher: you directly call the functions returned by the original collector. This matters even more for collectingAndThen:

public static<T,A,R,RR> Collector<T,A,RR> collectingAndThen(Collector<T,A,R> downstream,
                                                            Function<R,RR> finisher) {
    // ... some characteristics transformations ...
    return new CollectorImpl<>(downstream.supplier(),
                               downstream.accumulator(),
                               downstream.combiner(),
                               downstream.finisher().andThen(finisher),
                               characteristics);
}

Here only the finisher changes, while the original supplier, accumulator, and combiner are reused. Because the accumulator is called for every element, reducing the indirection can be quite important. Try rewriting mapping and collectingAndThen with your proposed design and you will see the problem. Newer JDK 9 collectors such as filtering and flatMapping also benefit from the current design.
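For illustration, here is a minimal sketch of what mapping could look like under the method-based design from the question. MethodCollector, toList, and collect are hypothetical stand-ins written for this example, not JDK types. The wrapper has to forward all four methods even though only accumulate changes, and every layer of wrapping adds another hop to each call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class MethodStyleDemo {
    // The interface shape proposed in the question (hypothetical, not the JDK's Collector).
    interface MethodCollector<T, A, R> {
        A supply();
        void accumulate(A container, T value);
        A combine(A left, A right);
        R finish(A container);
    }

    // mapping() under that design: three of the four methods only forward to downstream.
    static <T, U, A, R> MethodCollector<T, A, R> mapping(
            Function<? super T, ? extends U> mapper,
            MethodCollector<? super U, A, R> downstream) {
        return new MethodCollector<T, A, R>() {
            @Override public A supply() { return downstream.supply(); }
            @Override public void accumulate(A container, T value) {
                // The only method whose behavior actually changes.
                downstream.accumulate(container, mapper.apply(value));
            }
            @Override public A combine(A left, A right) { return downstream.combine(left, right); }
            @Override public R finish(A container) { return downstream.finish(container); }
        };
    }

    // A simple list collector in the same style.
    static <T> MethodCollector<T, List<T>, List<T>> toList() {
        return new MethodCollector<T, List<T>, List<T>>() {
            @Override public List<T> supply() { return new ArrayList<>(); }
            @Override public void accumulate(List<T> container, T value) { container.add(value); }
            @Override public List<T> combine(List<T> left, List<T> right) {
                left.addAll(right);
                return left;
            }
            @Override public List<T> finish(List<T> container) { return container; }
        };
    }

    // Sequential stand-in for Stream.collect(); combine() is unused here.
    static <T, A, R> R collect(Iterable<T> source, MethodCollector<T, A, R> collector) {
        A container = collector.supply();
        for (T item : source) {
            collector.accumulate(container, item);
        }
        return collector.finish(container);
    }

    static List<Integer> lengths(List<String> words) {
        MethodCollector<String, List<Integer>, List<Integer>> byLength =
                mapping(String::length, MethodStyleDemo.<Integer>toList());
        return collect(words, byLength);
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("a", "bb", "ccc"))); // prints [1, 2, 3]
    }
}
```

In the function-returning design, the equivalent combinator hands back the downstream's own supplier, combiner, and finisher objects, so nesting several combinators adds no forwarding layers to those three.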

Upvotes: 28
