Jason Law
Jason Law

Reputation: 1015

Why must the identity value be an identity for the combiner function in Stream.reduce?

I'm learning Java 8, in Reduction operations part in java.util.stream's package summary, it says:

More formally, the identity value must be an identity for the combiner function. This means that for all u, combiner.apply(identity, u) is equal to u. Additionally, the combiner function must be associative and must be compatible with the accumulator function: for all u and t, combiner.apply(u, accumulator.apply(identity, t)) must be equals() to accumulator.apply(u, t).

I don't understand why the identity value must be an identity for the combiner function. I think "combiner function must be associative and must be compatible" is enough to produce the same result either the stream is serial or parallel.

For example, I have a stream that has four elements e1, e2, e3, e4. If it's a serial stream, the result is identity ac e1 ac e2 ac e3 ac e4 (ac means accumulator function). If it's a parallel stream, the four elements can be split into two parts, [e1, e2] and [e3, e4], so the result is (identity ac e1 ac e2) co (identity ac e3 ac e4).

If given "the combiner function is associative and is compatible with the accumulator function", we can infer that "identity ac e1 ac e2 ac e3 ac e4" is equal to "(identity ac e1 ac e2) co (identity ac e3 ac e4)":

identity ac e1 ac e2 ac e3 ac e4
= (identity ac e1 ac e2 ac e3) co (identity ac e4) // because of compatibility
= (identity ac e1 ac e2) co (identity ac e3) co (identity ac e4) // because of compatibility
= (identity ac e1 ac e2) co ((identity ac e3) co (identity ac e4)) // because of associative property
= (identity ac e1 ac e2) co ((identity ac e3) ac e4) // because of compatibility
= (identity ac e1 ac e2) co (identity ac e3 ac e4)

So, why must the identity value be an identity for the combiner function?


Related questions:

Upvotes: 5

Views: 568

Answers (1)

TreffnonX
TreffnonX

Reputation: 2930

I don't understand why the identity value must be an identity for the combiner function.

It has to be, because otherwise if the provided stream is empty, no result could be generated. But the result of the reduction with identity is not optional. You can define a Java 8 reduction without an identity:

OptionalInt sum = Arrays.stream(new int[] { 0, 2, 6 }).reduce((a, b) -> a + b);

However if you do so, the empty stream will not return a valid result, as the reduction operation does not 'know' that 0 is the neutral element of the 'sum' operation you defined. Im mathematical terms, you are defining the identity element of the operation, so that there is a fallback in case the operation is not applicable at all. Visually you can describe the operation like this:

reduction = op(identity, a, b, c ...)

That is why you receive an OptionalInt, meaning that no integer value might have been calculated.

For a multiplication the identity is 1, for a sum 0 etc. As the identity must be neutral, it is not a fallback in the strictest sense, as a fallback can be anything, including -1 or any other value (must not be a numeric one). Thus OptionalInt offers a great alternative, because with an optional, you can apply the or(fallback) method, which does not need to be a neutral element of the reduction.

Upvotes: 2

Related Questions