Reputation: 176
I am new to Flink. I have a question that if all the messages sent to the downstream nodes are in order? For example,
[Stream] -> [DownStream]
Stream: [1,2,3,4,5,6,7,8,9]
Downstream get [3,2,1,4,5,7,6,8,9]
If so, how do we handle this case if we want it in order?
Any help would be very appreciated!
Upvotes: 0
Views: 533
Reputation: 43454
An operator can have multiple input channels. It will process the events from each channel in the order in which they were received. (Operators can also have multiple output channels.)
If your job has more than one pathway between stream and downstream, then the events can race and the the ordering will be non-deterministic. Otherwise the ordering will be preserved.
An example: Suppose you are reading, in parallel, from a Kafka topic with multiple partitions. Further imagine that all events from a given user are in the same Kafka partition (and are in order, by timestamp, for each user). Then in Flink you can use keyBy(user)
and be sure that the event stream for each user will remain in order. On the other hand, if the events for a given user are spread across multiple partitions, then keyBy(user)
will end up creating a stream of events for each user that is (almost certainly) out of order, because it will be pulling together events from several different FlinkKafkaConsumer
instances that are reading in parallel.
Upvotes: 1