Viktor Ershov

Reputation: 341

What is the proper way to use Google Pub/Sub with Flink Streaming using Dataproc?

I'm trying to figure out the proper way to run Apache Flink on Dataproc with Google Pub/Sub as a source/sink. When I create a Dataproc cluster and apply the Flink initialization action to the most recent image (1.4), Flink 1.6.4 gets installed.

The problem is that flink-connector-gcp-pubsub is only available starting from Flink version 1.9.0.

So my question is: what is the proper way to use all of this together? Should I build my own GCE image with the latest Flink? Does one already exist?

Upvotes: 4

Views: 1504

Answers (2)

Viktor Ershov

Reputation: 341

I solved this problem by running Flink 1.9.0 in Kubernetes. This way I do not depend on anyone else's image and can run whatever version I need.

Upvotes: 2

Gio Gogiashvili

Reputation: 71

As you already said, flink-connector-gcp-pubsub is only available from Flink 1.9.0, so you have two options: implement a Pub/Sub connector yourself for Flink 1.6.4, or build your own image with a newer Flink version.

I would not recommend implementing the connector yourself: it is a complex task that requires an in-depth understanding of Flink. Building your own image, by contrast, should be relatively easy given the existing example for Flink 1.6.4.
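Whichever route you take, it can help to check the Flink version an image provides before deploying a job that needs the connector. Here is a minimal sketch, assuming plain dotted numeric version strings such as "1.6.4"; the names `parse_version` and `supports_pubsub_connector` are hypothetical helpers for illustration, not part of Flink or Dataproc.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version like '1.9.0' into (1, 9, 0).

    Comparing tuples of ints avoids the string-comparison trap
    where '1.10.0' sorts before '1.9.0'.
    """
    return tuple(int(part) for part in version.split("."))


# First Flink release with flink-connector-gcp-pubsub, as stated above.
MIN_CONNECTOR_VERSION = parse_version("1.9.0")


def supports_pubsub_connector(flink_version: str) -> bool:
    """Hypothetical check: is this Flink version new enough for the connector?"""
    return parse_version(flink_version) >= MIN_CONNECTOR_VERSION
```

With this, `supports_pubsub_connector("1.6.4")` is false, which matches the situation in the question: the Flink version installed by the initialization action is too old for the connector.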

Upvotes: 5
