Balduz

Reputation: 3570

Get current task ID in Spark in Java

I need to get the ID of the current task in Spark. I have been searching on Google and in the official API docs, but the only IDs I can find are the executor ID and the ID of the RDD. Does anyone know how to get the unique ID of a task? I have seen that the class TaskInfo has exactly what I am looking for, but I do not know how to get an instance of this class.

Upvotes: 12

Views: 6369

Answers (1)

MitsakosGR

Reputation: 531

To get the ID of the current task you can use the TaskContext:

import org.apache.spark.TaskContext;

textFile.map(x -> {
    // TaskContext.get() returns the context of the task running on this executor
    TaskContext tc = TaskContext.get();
    System.out.println(tc.taskAttemptId());
    return x; // the Java map lambda must return a value
});

Bear in mind that this println will be printed on the executor node where the task runs, not on the driver's console.
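
If you want to see the task IDs on the driver instead, one option is to return them from the map and collect the result. Here is a minimal, self-contained sketch; the SparkConf/JavaSparkContext setup and the "input.txt" path are just placeholders for illustration, not part of the original answer:

import org.apache.spark.SparkConf;
import org.apache.spark.TaskContext;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.List;

public class TaskIdExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("task-id-example").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Placeholder input path
        JavaRDD<String> textFile = sc.textFile("input.txt");

        // Map each record to the attempt ID of the task processing it,
        // then collect so the IDs are visible on the driver.
        List<Long> taskIds = textFile
                .map(x -> TaskContext.get().taskAttemptId())
                .collect();

        taskIds.forEach(System.out::println);

        sc.stop();
    }
}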

Upvotes: 14
