Ahmad.S

Reputation: 809

Is there any way to get the TaskManager ID within a map in Apache Flink?

Using custom partitioning in Apache Flink, we specify a key for each record so that it is assigned to a particular TaskManager.

Suppose we broadcast a dataset to all of the nodes (TaskManagers). Is there any way, within a map or flatMap, to get the TaskManager ID?

Upvotes: 2

Views: 1677

Answers (1)

Fabian Hueske

Reputation: 18997

A custom partitioner does not assign records to a TaskManager but to a specific parallel task instance of the subsequent operator (a TM can execute multiple parallel task instances of the same operator).
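
To make that concrete: a partitioner returns an index in the range [0, numPartitions), and that index selects a parallel task instance, not a machine. Below is a minimal plain-Java sketch with no Flink dependency. The class name `SubtaskPartitioning` and the modulo rule are illustrative assumptions; only the method shape is meant to mirror Flink's `Partitioner.partition(key, numPartitions)`.

```java
public class SubtaskPartitioning {

    // Hypothetical stand-in shaped like Flink's Partitioner#partition(key, numPartitions).
    // The returned value is the index of the target parallel task instance.
    static int partition(long key, int numPartitions) {
        // floorMod keeps the result in [0, numPartitions) even for negative keys.
        return (int) Math.floorMod(key, (long) numPartitions);
    }

    public static void main(String[] args) {
        int parallelism = 4; // assumed parallelism of the downstream operator
        long[] keys = {0L, 1L, 5L, 8L, -1L};
        for (long key : keys) {
            System.out.println("key " + key + " -> subtask " + partition(key, parallelism));
        }
    }
}
```

Which TaskManager ends up hosting a given subtask index is decided by the scheduler, so several of these indices may land on the same TM.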

You can access the ID of a parallel task instance by extending a RichFunction, e.g., extend a RichMapFunction instead of implementing a MapFunction. Rich functions are available for all transformations. A RichFunction gives access to the RuntimeContext, which tells you the ID of the parallel task instance:

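import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public static class MyMapper extends RichMapFunction<Long, Long> {

    // Index of this parallel task instance, in [0, parallelism)
    private int pId;

    @Override
    public void open(Configuration config) {
        // Called once per parallel instance, before any call to map()
        pId = getRuntimeContext().getIndexOfThisSubtask();
    }

    @Override
    public Long map(Long value) throws Exception {
        // ... use pId here ...
        return value; // placeholder so the example compiles
    }
}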

Upvotes: 4
