Sadaf

Reputation: 257

Multiple Streams support in Apache Flink Job

My question is regarding the Apache Flink framework.

Is there any way to consume more than one streaming source, such as Kafka and Twitter, in a single Flink job? Is there any workaround? Can we process more than one streaming source at a time in a single Flink job?

I am currently working with Spark Streaming, and this is a limitation there.

Is this achievable with other streaming frameworks like Apache Samza, Storm, or NiFi?

Any response is much appreciated.

Upvotes: 7

Views: 8883

Answers (2)

Matthias J. Sax

Reputation: 62285

Yes, this is possible in Flink and Storm (no clue about Samza or NiFi...).

You can add as many source operators as you want, and each can consume from a different source.

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

Properties properties = ... // see Flink webpage for more details    

DataStream<String> stream1 = env.addSource(new FlinkKafkaConsumer08<>("topic", new SimpleStringSchema(), properties));
DataStream<String> stream2 = env.readTextFile("/tmp/myFile.txt");

DataStream<String> allStreams = stream1.union(stream2);
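As a minimal illustration of what `union` does (plain Java standing in for Flink, with hypothetical names like `UnionSketch` and bounded lists standing in for the Kafka and file sources), merging simply produces one stream containing every element of both inputs, without any join or keying:

```java
import java.util.ArrayList;
import java.util.List;

public class UnionSketch {
    // Merge two bounded "streams" into one, keeping every element of both,
    // analogous to stream1.union(stream2) in the Flink snippet above.
    static List<String> union(List<String> a, List<String> b) {
        List<String> merged = new ArrayList<>(a);
        merged.addAll(b);
        return merged;
    }

    public static void main(String[] args) {
        // Stand-ins for stream1 (Kafka records) and stream2 (file lines).
        List<String> merged = union(List.of("kafka-1", "kafka-2"), List.of("file-1"));
        System.out.println(merged); // [kafka-1, kafka-2, file-1]
    }
}
```

In Flink the same merged stream can then be processed by downstream operators as if it came from a single source.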

For Storm's low-level API, the pattern is similar (a bolt can subscribe to multiple spouts/bolts). See: An Apache Storm bolt receive multiple input tuples from different spout/bolt

Upvotes: 7

Dennis Jaheruddin

Reputation: 21563

Some solutions have already been covered; I just want to add that in a NiFi flow you can ingest many different sources and process them either separately or together.

It is also possible to ingest a source once and have multiple teams build flows on it, without needing to ingest the data multiple times.

Upvotes: 0
