Reputation: 129
If I want to split a stream in Flink, what is the best way to do that?
I could use a process function and split the stream by using side outputs. Do watermarks get passed to the side outputs along with the elements so that the data in each side output can go downstream to other windowed operators?
Or should I just apply multiple filter() operations to the same stream, so that each resulting stream contains a subset of the elements? How are watermarks handled in that case? Are all watermarks passed to every filtered stream?
If both are possible, which is preferred (which has better performance)? Or is there a better way than either of the options described above?
Upvotes: 0
Views: 1788
Reputation: 43419
Side outputs are the generally preferred way to split a stream. They have the advantage of being able to split a stream n-ways, into streams of different types, and with excellent performance.
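For illustration, here is a minimal sketch of a side-output split using the Java DataStream API. The class name, tag names, sample values, and the routing rule (even / odd / larger than 1000) are made up for the example, and `fromElements` is only there to give the job some input. As far as I know, watermarks are forwarded to side outputs just as they are to the main output, so each branch can feed its own windowed operator downstream.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSplit {
    // Hypothetical tags. The trailing {} creates an anonymous subclass so
    // Flink can capture the element type despite type erasure.
    static final OutputTag<Integer> ODD = new OutputTag<Integer>("odd") {};
    static final OutputTag<String> LARGE = new OutputTag<String>("large") {};

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Sample input, assumed only for this sketch.
        DataStream<Integer> input = env.fromElements(1, 2, 3, 4, 1500);

        // Main output carries even numbers; the side outputs carry the rest.
        SingleOutputStreamOperator<Integer> evens = input.process(
                new ProcessFunction<Integer, Integer>() {
                    @Override
                    public void processElement(
                            Integer value, Context ctx, Collector<Integer> out) {
                        if (value > 1000) {
                            // A side output may have a different type than the main output.
                            ctx.output(LARGE, "large value: " + value);
                        } else if (value % 2 == 0) {
                            out.collect(value);
                        } else {
                            ctx.output(ODD, value);
                        }
                    }
                });

        // Each side output is retrieved from the operator that emitted it.
        DataStream<Integer> odds = evens.getSideOutput(ODD);
        DataStream<String> large = evens.getSideOutput(LARGE);

        evens.print("even");
        odds.print("odd");
        large.print("large");

        env.execute("side-output-split");
    }
}
```

Each element is processed once and routed to exactly one output, whereas the multiple-filter() approach evaluates every predicate against every element.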
There is another way to split a stream that you didn't mention: split and select. Split/select is NOT recommended. The implementation is something of a hack, and the performance isn't as good. (DataStream#split() was deprecated and then removed in Flink 1.12, with side outputs as the suggested replacement.)
Upvotes: 4