Reputation: 809
Using custom partitioning in Apache Flink, we specify a key for each record to be assigned to a particular taskmanager.
Consider we broadcast a dataset to all of the nodes, taskmanagers. Is there any While in a map or faltmap to get the taskmanagef Id or not?
Upvotes: 2
Views: 1677
Reputation: 18997
A custom partitioner does not assign records to a TaskManager but to a specific parallel task instance of the subsequent operator (a TM can execute multiple parallel task instances of the same operator).
You can access the ID of a parallel task instance, be extending a RichFunction
, e.g., extend a RichMapFunction
instead of implementing a MapFunction
. Rich functions are available for all transformation. A RichFunction
gives access to the RuntimeContext
which tells you the ID of the parallel task instance:
public static class MyMapper extends RichMapFunction<Long, Long> {
@Override
public void open(Configuration config) {
int pId = getRuntimeContext().getIndexOfThisSubtask();
}
@Override
public Long map(Long value) throws Exception {
// ...
}
}
Upvotes: 4