Reputation: 93
I want to use Flink's event time and plan to implement a simple emitWatermark that returns System.currentTimeMillis() - 10 secs. My understanding is that a tumbling window will then fire at start_time + window_interval + 10 secs, so events that arrive later than the watermark will be dropped.
Is there a way to write all the events dropped by Flink to a sink like S3?
Upvotes: 2
Views: 1418
Reputation: 46
It should be achievable with side outputs. The documentation of the sideOutputLateData operator states the following:
Send late arriving data to the side output identified by the given {@link OutputTag}. Data is considered late after the watermark has passed the end of the window plus the allowed lateness set using {@link #allowedLateness(Time)}.
You can then retrieve the late data stream with that same output tag (by calling getSideOutput on the result of the windowed operation) and sink it to S3.
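To make the lateness rule from the quoted documentation concrete, here is a minimal plain-Java sketch. It does not use the Flink API; the class LateDataSketch and its methods are hypothetical names chosen for illustration. It assumes tumbling windows with no offset, non-negative timestamps in milliseconds, and that the last timestamp belonging to a window is its end minus one. An element counts as late once the watermark has reached that last timestamp plus the allowed lateness, and late elements are collected separately, much as a side output would receive them.

```java
import java.util.ArrayList;
import java.util.List;

public class LateDataSketch {

    // Start of the tumbling window that contains the timestamp
    // (assumption: non-negative timestamps, no window offset).
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Models the rule "late after the watermark has passed the end of the
    // window plus the allowed lateness". This is a simplified model, not Flink code.
    static boolean isLate(long timestampMs, long windowSizeMs,
                          long allowedLatenessMs, long watermarkMs) {
        long windowEnd = windowStart(timestampMs, windowSizeMs) + windowSizeMs;
        long lastTimestampInWindow = windowEnd - 1;
        return lastTimestampInWindow + allowedLatenessMs <= watermarkMs;
    }

    // Collects the timestamps that would be routed to the "late" output
    // for a fixed watermark, preserving input order.
    static List<Long> collectLate(long[] timestampsMs, long windowSizeMs,
                                  long allowedLatenessMs, long watermarkMs) {
        List<Long> late = new ArrayList<>();
        for (long ts : timestampsMs) {
            if (isLate(ts, windowSizeMs, allowedLatenessMs, watermarkMs)) {
                late.add(ts);
            }
        }
        return late;
    }

    public static void main(String[] args) {
        long windowSizeMs = 60_000L;   // one-minute tumbling windows
        long watermarkMs = 130_000L;   // hypothetical current watermark
        long[] events = {50_000L, 70_000L, 125_000L};

        // With no allowed lateness, elements whose window has already closed are late.
        System.out.println("late (no allowed lateness): "
                + collectLate(events, windowSizeMs, 0L, watermarkMs));

        // With 15 seconds of allowed lateness, fewer elements count as late.
        System.out.println("late (15 s allowed lateness): "
                + collectLate(events, windowSizeMs, 15_000L, watermarkMs));
    }
}
```

In an actual Flink job the same split is done by the window operator itself: elements that fail this check go to the side output registered with sideOutputLateData instead of being dropped, and that stream can be given its own sink.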
Upvotes: 3