Reputation: 6124
I am running my pipeline in Dataflow. I want to collect all error messages from a Dataflow job using its ID. I am using Apache Beam 2.3.0 and Java 8.
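// Get the job ID from the running pipeline result.
DataflowPipelineJob dataflowPipelineJob = (DataflowPipelineJob) entry.getValue();
String jobId = dataflowPipelineJob.getJobId();
// Look the job up through the Dataflow client (getJob can throw IOException).
DataflowClient client = DataflowClient.create(options);
Job job = client.getJob(jobId);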
Is there any way to retrieve only the error messages from the pipeline?
Upvotes: 0
Views: 706
Reputation: 7493
Programmatic support for reading Dataflow log messages is not very mature, but there are a couple of options:
Since you already have the DataflowPipelineJob instance, you could use the waitUntilFinish() overload that accepts a JobMessagesHandler parameter to filter and capture error messages. You can see how DataflowPipelineJob uses this in its own waitUntilFinish() implementation.
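A minimal sketch of the filtering step such a handler would perform. It assumes each message reports its importance as a string such as JOB_MESSAGE_ERROR, as the Dataflow API's JobMessage does. ErrorCollector and its two accessor parameters are hypothetical names for illustration, not part of Beam. The class uses only the JDK so the logic can be read on its own; the Beam wiring is shown only in a comment and is untested here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Beam): keeps only the messages whose
 * importance equals JOB_MESSAGE_ERROR.
 *
 * Assumed wiring with Beam, untested here:
 *   ErrorCollector<JobMessage> errors = new ErrorCollector<>(
 *       JobMessage::getMessageImportance, JobMessage::getMessageText);
 *   job.waitUntilFinish(Duration.standardMinutes(30), errors::process);
 */
public class ErrorCollector<T> {
    private static final String ERROR = "JOB_MESSAGE_ERROR";

    private final Function<T, String> importanceOf;
    private final Function<T, String> textOf;
    private final List<String> errorTexts = new ArrayList<>();

    public ErrorCollector(Function<T, String> importanceOf, Function<T, String> textOf) {
        this.importanceOf = importanceOf;
        this.textOf = textOf;
    }

    /** Called once per batch of messages; stores the text of error-level ones. */
    public void process(List<T> messages) {
        for (T message : messages) {
            if (ERROR.equals(importanceOf.apply(message))) {
                errorTexts.add(textOf.apply(message));
            }
        }
    }

    /** Read-only view of the error texts collected so far. */
    public List<String> getErrorTexts() {
        return Collections.unmodifiableList(errorTexts);
    }
}
```

Once waitUntilFinish() returns, getErrorTexts() would hold whatever error-level messages the job emitted while it was being monitored. Some of those may be non-fatal.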
Alternatively, you can query job logs using the Dataflow REST API: projects.jobs.messages/list. The API takes a minimumImportance parameter (for example, JOB_MESSAGE_ERROR), which lets you query just for errors.
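As a rough illustration of the REST route, the sketch below only builds the request URL for the v1b3 messages endpoint with minimumImportance set to JOB_MESSAGE_ERROR. JobMessagesUrl and errorsOnly are hypothetical names, and the project and job IDs are assumed to contain only URL-safe characters. Sending the request and authenticating are left out.

```java
import java.net.URI;

/** Hypothetical helper: builds the projects.jobs.messages list URL for one job. */
public class JobMessagesUrl {
    private static final String BASE = "https://dataflow.googleapis.com/v1b3";

    /** Returns the list URL restricted to error-level messages. */
    public static URI errorsOnly(String projectId, String jobId) {
        return URI.create(BASE
            + "/projects/" + projectId
            + "/jobs/" + jobId
            + "/messages?minimumImportance=JOB_MESSAGE_ERROR");
    }
}
```

A GET on this URL needs an OAuth 2.0 access token in the Authorization header, and the response can be paginated, so a complete client would follow nextPageToken until it is empty.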
Note that in both cases, there may be error messages which are not fatal and don't directly cause job failure.
Upvotes: 1