Janik Zikovsky
Janik Zikovsky

Reputation: 3256

Flink DataStream - how to start a source from an input element?

Say I have a Flink SourceFunction<String> called RequestsSource.

On each request coming in from that source, I would like to subscribe to an external data source (for the purposes of an example, it could start a separate thread and start producing data on that thread).

The output data could be joined on a single DataStream. For example

Input Requests: A, B
Data produced:
 A1
 B1
 A2
 A3
 B2
 ...

... and so on, with new elements being added to the DataStream forever.

How do I write a Flink Operator that can do this? Can I use e.g. FlatMapFunction?

Upvotes: 0

Views: 707

Answers (2)

David Anderson
David Anderson

Reputation: 43439

It sounds you are asking about an operator that can emit one or more boundless streams of data based on a connection to an external service, after receiving subscription events. The only clean way I can see to do this is to do all the work in the SourceFunction, or in a custom Operator.

I don't believe async i/o can emit an unbounded stream of results from a single input event. A ProcessFunction can do that, but only via its onTimer method.

Upvotes: 1

kkrugler
kkrugler

Reputation: 9245

you'd typically want to use an AsyncFunction, which (asynchronously) can take one input element, call some external service, and emit a collection of results.

See also Apache Flink Training - Async IO.

-- Ken

Upvotes: 2

Related Questions