Reputation: 470
We're building an application which has two streams:
We want to connect the two streams in order to get shared state, so that the 1st stream can use the 2nd state for enrichment.
Every day or so, the parquet files (2nd streams's source) are updated, and that will require us to clear the state of the 2nd stream and rebuild it (will probably take about 2 minutes).
The question is, can we block/delay messages from the 1st stream while this process is running?
Thanks.
Upvotes: 0
Views: 299
Reputation: 166
Sounds a bit like your case is similar to Flip-23, which explores Model Serving in Apache Flink.
I think it all boils down to how (and if) your static stream is keyed:
ListState
as a buffer, how you can access this though depends on the shape of your data.It might help, if you shared a bit more info about the shape of your data (eg are you joining on a key? are you simply serving a model? other?).
Upvotes: 1
Reputation: 9245
There's currently no direct/easy way to block one stream on another stream, unfortunately. The typical solution is to buffer the ingest stream while you load (or re-load) the enrichment stream.
One approach you could try is to wrap your ingest stream in a custom SourceFunction
that knows when to not generate data, based on some external trigger (which is the same signal you'd use to know that you have Parquet data to re-load).
Upvotes: 1