Reputation: 3069
Stream.reduce has 3 method overloads:
Optional<T> reduce(BinaryOperator<T> accumulator)
T reduce(T identity, BinaryOperator<T> accumulator)
<U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)
I'm having a hard time understanding how the third overload (Stream.reduce(identity, accumulator, combiner)) works and what its use case is. So, how does it work, and why does it exist?
Upvotes: 2
Views: 358
Reputation: 40034
Note: Some of the examples are contrived for demonstration. In some instances a simple .sum() could have been used.
The big difference, imo, is that the third form takes a BiFunction as its second argument instead of a BinaryOperator, so you can use it to change the result type. It also takes a BinaryOperator as a combiner to merge the partial results of parallel operations.
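To make that type change concrete, here is a minimal sketch of my own (not from the original answer): the accumulator's two arguments have different types, reducing a Stream<String> to an int.
// requires: import java.util.stream.Stream;
// Accumulator is a BiFunction<Integer,String,Integer>: the running total (Integer)
// and the next element (String) differ in type, so the result type can differ too.
int totalLength = Stream.of("a", "bb", "ccc")
        .reduce(0, (len, s) -> len + s.length(), Integer::sum);
// totalLength == 6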
Generate some data
// requires: import java.util.List; import java.util.Random;
record Data(String name, int value) {}
Random r = new Random();
List<Data> dataList = r.ints(1000, 1, 20).mapToObj(i -> new Data("Item" + i, i)).toList();
No parallel operation, but different types. The third argument (the combiner) is still required, but it is not invoked for a sequential pipeline, so it can simply return the running sum.
int sum = dataList.stream().reduce(0,
        (partial, data) -> partial + data.value(), // accumulator: Integer + Data -> Integer
        (finalSum, partialSum) -> finalSum);       // combiner: unused on a sequential stream
System.out.println(sum);
prints
10162
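Worth noting (my addition, not part of the original answer): that discarding combiner is only safe because the stream is sequential. Run the same reduction in parallel and partial sums are silently thrown away:
// BROKEN in parallel: the combiner keeps finalSum and discards partialSum,
// so only one chunk's total survives. A parallel reduce needs a real
// combiner such as Integer::sum, as the parallel example below shows.
int wrong = dataList.parallelStream().reduce(0,
        (partial, data) -> partial + data.value(),
        (finalSum, partialSum) -> finalSum);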
The second form. Use map to get the value to be summed. A BinaryOperator is used here since the types are the same and there is no parallel operation.
sum = dataList.stream().map(Data::value).reduce(0, (sum1, val) -> sum1 + val);
System.out.println(sum); // prints same as above
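As the note at the top of this answer says, these examples are contrived; a sketch of the simpler equivalent:
// Equivalent without reduce: map to an IntStream and sum it directly.
int sameSum = dataList.stream().mapToInt(Data::value).sum();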
This shows the same as above but in parallel. The third argument combines the partial sums, and those sums are combined as each thread finishes, so there may not be a sensible order to the output.
sum = dataList.parallelStream().reduce(0, (sum1, data) -> sum1 + data.value(),
        (finalSum, partialSum) -> {
            System.out.println("Adding " + partialSum + " to " + finalSum);
            finalSum += partialSum;
            return finalSum;
        });
System.out.println(sum);
prints something like the following
Adding 586 to 670
Adding 567 to 553
Adding 1256 to 1120
Adding 715 to 620
Adding 624 to 601
Adding 1335 to 1225
Adding 2560 to 2376
Adding 662 to 579
Adding 706 to 715
Adding 1421 to 1241
Adding 713 to 689
Adding 576 to 586
Adding 1402 to 1162
Adding 2662 to 2564
Adding 4936 to 5226
10162
One final note: none of the Collectors.reducing methods take a BiFunction to handle different types. Instead, the second argument is a Function that acts as a mapper, so that the third argument, a BinaryOperator, can combine the mapped values.
sum = dataList.parallelStream().collect(
        Collectors.reducing(0, Data::value, (finalSum, partialSum) -> {
            System.out.println("Adding " + partialSum + " to " + finalSum);
            return finalSum + partialSum;
        }));
System.out.println(sum);
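For completeness (my addition): when the goal really is just a sum, a purpose-built collector avoids hand-written reduction logic entirely:
// Equivalent using a dedicated collector.
int collected = dataList.parallelStream().collect(Collectors.summingInt(Data::value));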
Upvotes: 2
Reputation: 50716
Basically it combines a mapping function with a reduction. Most of the examples I've seen for this don't really demonstrate why it's preferable to calling map() and a normal reduce() in separate steps. The API Note comes in handy here:
Many reductions using this form can be represented more simply by an explicit combination of map and reduce operations. The accumulator function acts as a fused mapper and accumulator, which can sometimes be more efficient than separate mapping and reduction, such as when knowing the previously reduced value allows you to avoid some computation.
So let's say we have a Stream<String> numbers, and we want to parse them to BigDecimal and calculate their product. We could do something like this:
BigDecimal product = numbers.map(BigDecimal::new)
        .reduce(BigDecimal.ONE, BigDecimal::multiply);
But this has an inefficiency: if one of the numbers is "0", we're wasting cycles converting the remainder to BigDecimal. We can use the 3-arg reduce() here to bypass the mapping logic:
BigDecimal product = numbers.reduce(BigDecimal.ONE,
        (d, n) -> d.equals(BigDecimal.ZERO) ? BigDecimal.ZERO : new BigDecimal(n).multiply(d),
        BigDecimal::multiply);
Of course it would be even more efficient to short-circuit the stream entirely, but that's tricky to do in a stream, especially in parallel. And this is just an example to get the concept across.
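A sketch of that short-circuiting idea (my own addition, assuming the values are available as a hypothetical List<String> named list, since a one-shot Stream can't be traversed twice):
// Two-pass short circuit: anyMatch stops at the first literal "0", skipping the
// remaining parsing; the product is only computed when no zero is present.
// Caveat: this only catches the exact string "0", not variants like "0.0".
BigDecimal product = list.stream().anyMatch("0"::equals)
        ? BigDecimal.ZERO
        : list.stream().map(BigDecimal::new)
              .reduce(BigDecimal.ONE, BigDecimal::multiply);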
Upvotes: 1
Reputation: 2207
If I understand correctly, your question is about the third argument, the combiner.
Firstly, one of the goals of the streams API was to have similar APIs for sequential and parallel streams. The 3-argument version of reduce is useful for parallel streams.
Suppose you are reducing a Collection<T> to a value of type U, and you are using a parallel stream. The parallel stream splits the collection into smaller streams and produces a partial U value for each by applying the accumulator (the second function). But now these partial U values have to be combined. How do they get combined? The combiner (the third function) provides that logic.
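A small sketch of that flow (my illustration, with hypothetical values): split, accumulate each chunk to a partial result, then combine the partials.
// requires: import java.util.List;
// T = String, U = Integer: total character count of the words.
List<String> words = List.of("alpha", "beta", "gamma", "delta");
int totalChars = words.parallelStream()
        .reduce(0,
                (partial, word) -> partial + word.length(), // accumulator: U + T -> U
                Integer::sum);                              // combiner: U + U -> U
// totalChars == 19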
Upvotes: 1