Maxim
Maxim

Reputation: 9961

How to run one source per task manager (or per node)?

I have implemented source which open fixed UDP port and listen it. So, I want to run exactly one source per task manager (in my case I run one task manager per node) because overwise a java.net.BindException: Address already in use exception will be thrown.

I notice this problem when test HA of Apache Flink. When I shut down one task manager the Apache Flick started trying to run two sources with the same port on one node.

So, how to run exactly one source per task manager (or per cluster node)?

Upvotes: 1

Views: 1257

Answers (1)

Till Rohrmann
Till Rohrmann

Reputation: 13346

It is currently not possible to dynamically enforce that exactly one task of a kind runs on each TaskManager. You can avoid that multiple source tasks are scheduled to the same machine by setting the number of slots to 1. However, then if you lose a machine and don't have a spare TaskManager, then you won't have enough slots to restart the job.

Alternatively, you could write your sources such that they are more resilient. For example, you could simple stop a source if they cannot bind to the specified port. Given that no other program can bind to the port, then you know that there is already another source task consuming data from this port.

Upvotes: 1

Related Questions