Reputation: 621
I read the source code , reduce will forword every result to downstream.
I want reduce a stream by key without window,
stream.keyBy(key)
.reduce((a, b) -> {
//reduce
return a+b;
});
if reduce on window, flink will forword element to downstream when watermark arrived, so how flink determine reduce finish without window.
Upvotes: 5
Views: 2717
Reputation: 651
If you do reduce without window, Flink will emit a partial aggregated record after each element the reduce operator encountered. It's very dangerous for the performance, and in most cases it's not what you want.
Here is a quick Scala example to show the problem:
package org.example
import org.apache.flink.streaming.api.scala.{StreamExecutionEnvironment, createTypeInformation}
object TestReduce {
def main(args: Array[String]): Unit = {
val env = StreamExecutionEnvironment.getExecutionEnvironment
val stream = env.fromCollection(Seq(
("k1", 1),
("k1", 1),
("k1", 1),
("k1", 1),
("k1", 1),
("k1", 1))
)
stream
.keyBy(_._1)
.reduce((p, q) => (p._1, p._2 + q._2))
.print()
env.execute("test reduce")
}
}
Running the code above will output:
1> (k1,1)
1> (k1,2)
1> (k1,3)
1> (k1,4)
1> (k1,5)
1> (k1,6)
Upvotes: 0
Reputation: 43439
With stream processing, in general there isn't the idea that computations "finish". They just keep going indefinitely. The non-windowed reduce just keeps on reducing for as long as you leave the job running.
Upvotes: 2
Reputation: 3592
According to the official documentation https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/
Reduce KeyedStream → DataStream
A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and emits the new value.
Window Reduce WindowedStream → DataStream
Applies a functional reduce function to the window and returns the reduced value.
The key difference is that:
reduce
is done in a Window, the function combines the current value with the window one.reduce
is done in a KeyedStream, the function combines the current value with the latest one.Upvotes: 3