Reputation: 3256
Say I have a Flink SourceFunction<String> called RequestsSource.
On each request coming in from that source, I would like to subscribe to an external data source (for the purposes of an example, it could start a separate thread and start producing data on that thread).
The output data could be joined on a single DataStream. For example:
Input requests: A, B
Data produced: A1 B1 A2 A3 B2 ...
... and so on, with new elements being added to the DataStream forever.
How do I write a Flink Operator that can do this? Can I use e.g. FlatMapFunction?
Upvotes: 0
Views: 707
Reputation: 43439
It sounds like you are asking about an operator that, after receiving subscription events, can emit one or more unbounded streams of data based on a connection to an external service. The only clean way I can see to do this is to do all the work in the SourceFunction, or in a custom Operator.
I don't believe Async I/O can emit an unbounded stream of results from a single input event. A ProcessFunction can do that, but only via its onTimer method.
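To picture the "do all the work in the source" approach, here is a minimal sketch in plain Java rather than against Flink's API. Each request starts a producer thread, and every producer writes into one shared queue that a single loop drains. In a real SourceFunction that loop would presumably live in run() and call ctx.collect(...) while holding the checkpoint lock, instead of appending to a list. The names SubscriptionMerger, subscribe, drain and demo are hypothetical, and the per-request element count is bounded only so the demo terminates; a real subscription would keep producing until cancelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of a source that fans in data from per-request subscriptions.
public class SubscriptionMerger {
    // Single hand-off point shared by all producer threads.
    private final BlockingQueue<String> merged = new LinkedBlockingQueue<>();

    // Start one producer thread per request (assumption: a real one would be unbounded).
    public Thread subscribe(String request, int perRequest) {
        Thread t = new Thread(() -> {
            for (int i = 1; i <= perRequest; i++) {
                merged.add(request + i); // e.g. "A1", "A2", ...
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Stand-in for the source's run() loop: take elements as they arrive.
    public List<String> drain(int expected) throws InterruptedException {
        List<String> out = new ArrayList<>();
        while (out.size() < expected) {
            out.add(merged.take()); // in Flink this would be ctx.collect(...)
        }
        return out;
    }

    // Two requests, A and B, merged into one output sequence.
    public static List<String> demo() {
        SubscriptionMerger source = new SubscriptionMerger();
        source.subscribe("A", 3);
        source.subscribe("B", 2);
        try {
            return source.drain(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The interleaving of A* and B* elements varies between runs.
        System.out.println(demo());
    }
}
```

Elements from one request keep their relative order (A1 before A2), while the interleaving across requests depends on thread scheduling, which matches the "A1 B1 A2 A3 B2 ..." shape in the question.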
Upvotes: 1
Reputation: 9245
You'd typically want to use an AsyncFunction, which can (asynchronously) take one input element, call some external service, and emit a collection of results.
See also Apache Flink Training - Async IO.
-- Ken
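The shape described in this answer (one input element in, one finite collection of results out, completed asynchronously) can be modeled in plain Java with CompletableFuture. This is only a sketch and not Flink's AsyncFunction itself; in Flink the corresponding step would be completing the ResultFuture once inside asyncInvoke. The names AsyncLookupSketch, lookup and demo, and the fake results, are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of "one request in, one finite batch of results out".
public class AsyncLookupSketch {
    // Stand-in for an external service call; returns a finite batch per request.
    static CompletableFuture<List<String>> lookup(String request) {
        return CompletableFuture.supplyAsync(
                () -> List.of(request + "1", request + "2"));
    }

    public static List<String> demo(String request) {
        // join() blocks only to keep the demo simple; real async code
        // would attach a callback instead of waiting.
        return lookup(request).join();
    }

    public static void main(String[] args) {
        System.out.println(demo("A")); // prints [A1, A2]
    }
}
```

Because the future completes exactly once per request, this pattern suits lookups with a bounded response; it does not by itself model a subscription that keeps emitting forever.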
Upvotes: 2