Joseph N

Reputation: 560

How to execute a Cloud Function after Dataflow job has succeeded?

I want to trigger a Cloud Function only if a Dataflow job execution completes successfully.

The Cloud Function should not be triggered if the Dataflow job fails.

I am running a Dataflow job using a Dataflow template (JDBC to BigQuery) from the Dataflow UI.

There is no option in the UI to trigger a Cloud Function (or anything else) after job execution, and I can't make changes to the template code. What's the way to trigger the Cloud Function?

Upvotes: 0

Views: 1663

Answers (4)

Niek Veenstra

Reputation: 1

I have been looking for a solution for some time and found that you can use Google Eventarc to publish an event for the different Dataflow job states, for example to send an HTTP request to a Cloud Function after your job has finished.

These documents provide more information:

  1. DataFlow Eventarc docs
  2. Eventarc documentation
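As a rough sketch (none of these names come from the original answer), an Eventarc trigger along these lines routes Dataflow job status changes to a 2nd gen Cloud Function, which Eventarc addresses as a Cloud Run service; confirm the exact event type against the Dataflow Eventarc docs above:

gcloud eventarc triggers create dataflow-status-trigger \
    --location=us-central1 \
    --event-filters="type=google.cloud.dataflow.job.v1beta3.statusChanged" \
    --destination-run-service=my-dataflow-handler \
    --destination-run-region=us-central1 \
    --service-account=SERVICE_ACCOUNT_EMAIL

The function can then inspect the event payload and act only when the reported job state is a successful completion.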

Upvotes: 0

lebanoob

Reputation: 11

I've had to do something similar before.

https://beam.apache.org/releases/javadoc/2.5.0/org/apache/beam/sdk/PipelineResult.html

// Block until the pipeline finishes, then call the Cloud Function only on success.
PipelineResult result = pipeline.run();
PipelineResult.State state = result.waitUntilFinish();
if (state == PipelineResult.State.DONE) {
    callCloudFunction();
}

Then you can set up your Cloud Function to be triggered by an HTTP request. https://cloud.google.com/functions/docs/calling/http
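For completeness, here is a minimal sketch of what that callCloudFunction() helper could look like, assuming a Java 11+ runtime and an HTTP-triggered function; the URL is a placeholder and the helper itself is not part of the Beam API:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: sends a POST to the HTTP-triggered Cloud Function.
static int callCloudFunction() throws Exception {
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://REGION-PROJECT_ID.cloudfunctions.net/my-function"))
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
    // Return the HTTP status code so the pipeline code can check the call succeeded.
    HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
    return response.statusCode();
}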

Upvotes: 1

Andrew

Reputation: 855

You might find this feature request for triggering a Cloud Function when a Dataflow job has finished useful.

You can determine whether a Dataflow job has failed or succeeded from the command line. You can list jobs and check their current status; for example, to check a single job, run:

gcloud beta dataflow jobs describe <JOB_ID>
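If you only need the state, the global --format flag can narrow the output to the currentState field; a successful job reports JOB_STATE_DONE (the region flag is shown on the assumption that the job is a regional one):

gcloud beta dataflow jobs describe <JOB_ID> \
    --region=<REGION> \
    --format='value(currentState)'

A small script or scheduled check could poll that value and call the Cloud Function's HTTP endpoint once it reads JOB_STATE_DONE.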

The JDBC to BigQuery template source code can be found on GitHub. You could always create a custom template if you need to make any specific changes.

Upvotes: -1

guillaume blaquiere

Reputation: 75735

There isn't a built-in feature for this yet, but I can propose a workaround.

  • Go to Cloud Logging and open the advanced filter (or the new UI)
  • Enter this filter
resource.type="dataflow_step"
textPayload="Worker pool stopped."
  • Then create a sink (action -> create sink in the new UI)
  • Choose Pub/Sub as the sink destination (create a new topic for this)
  • Save
  • Then link the topic to your Cloud Function (either with a push subscription to an HTTP-triggered Cloud Function, or with a topic-triggered Cloud Function)

This way, at the end of every Dataflow job, a new message is posted to Pub/Sub and your Cloud Function is triggered.
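As an illustrative sketch (the topic and sink names are placeholders, not from the original answer), the same setup can be scripted with gcloud:

gcloud pubsub topics create dataflow-job-finished

gcloud logging sinks create dataflow-finished-sink \
    pubsub.googleapis.com/projects/PROJECT_ID/topics/dataflow-job-finished \
    --log-filter='resource.type="dataflow_step" AND textPayload="Worker pool stopped."'

The sink's writer identity also needs the Pub/Sub Publisher role on the topic before messages will be delivered.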

Upvotes: 3
