Samuel Neves

Reputation: 23

Way to trigger Dataflow only after BigQuery job finished

Currently, my data goes through the following steps:

New objects in a GCS bucket trigger a Google Cloud Function that creates a BigQuery job to load this data into BigQuery.
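Roughly, the function does something like this (a simplified sketch; the dataset and table names are placeholders):

from google.cloud import bigquery

def gcs_to_bigquery(event, context):
    # Background Cloud Function triggered when a new object lands in the bucket.
    client = bigquery.Client()
    uri = "gs://{}/{}".format(event["bucket"], event["name"])
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,
    )
    load_job = client.load_table_from_uri(uri, "my_dataset.my_table", job_config=job_config)
    # The function returns right away; the load job keeps running inside BigQuery,
    # which is why I need a separate signal for when it finishes.
    print("Started load job {}".format(load_job.job_id))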

I need a low-cost solution to know when this BigQuery job is finished, so that a Dataflow pipeline is triggered only after the job is completed.

Note:

Upvotes: 2

Views: 2027

Answers (2)

guillaume blaquiere

Reputation: 75735

Despite what you mention about Stackdriver Logging, you can use it with this filter:

resource.type="bigquery_resource"
protoPayload.serviceData.jobCompletedEvent.job.jobStatus.state="DONE"
severity="INFO"

You can also add a dataset filter if needed.

Then create a sink with this advanced filter that triggers a Cloud Function, and run your Dataflow job from that function.
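For illustration, assuming the sink's destination is a Pub/Sub topic that triggers a Cloud Function, and that your Dataflow pipeline is available as a template (the project ID and template path below are placeholders), the function could look roughly like this:

import base64
import json

from googleapiclient.discovery import build

def launch_dataflow(event, context):
    # The Pub/Sub message payload is the LogEntry of the completed BigQuery job.
    entry = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    job = entry["protoPayload"]["serviceData"]["jobCompletedEvent"]["job"]
    print("BigQuery job done: {}".format(job["jobName"]["jobId"]))

    # Launch the Dataflow pipeline from a template.
    dataflow = build("dataflow", "v1b3", cache_discovery=False)
    response = dataflow.projects().templates().launch(
        projectId="my-project",                          # placeholder
        gcsPath="gs://my-bucket/templates/my-template",  # placeholder
        body={"jobName": "after-bq-load", "parameters": {}},
    ).execute()
    print("Launched Dataflow job: {}".format(response["job"]["id"]))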

If this doesn't match your expectations, can you detail why?

Upvotes: 1

Jayadeep Jayaraman

Reputation: 2825

You can look at Cloud Composer, which is managed Apache Airflow, for orchestrating jobs in a sequential fashion. You define a DAG, and Composer executes each node of the DAG, checking dependencies so that tasks run in parallel or sequentially based on the conditions you have defined.

You can take a look at the example mentioned here - https://github.com/GoogleCloudPlatform/professional-services/tree/master/examples/cloud-composer-examples/composer_dataflow_examples
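For illustration, a minimal DAG along those lines could look like this (using the Google provider operators; the bucket, table, and template paths are placeholders, not taken from the linked example):

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator
from airflow.providers.google.cloud.operators.dataflow import DataflowTemplatedJobStartOperator

with DAG(
    dag_id="gcs_to_bq_then_dataflow",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,  # trigger externally, e.g. when a new file lands in GCS
    catchup=False,
) as dag:

    load_to_bq = GCSToBigQueryOperator(
        task_id="load_to_bq",
        bucket="my-bucket",                                    # placeholder
        source_objects=["incoming/*.json"],                    # placeholder
        destination_project_dataset_table="my_dataset.my_table",
        source_format="NEWLINE_DELIMITED_JSON",
        write_disposition="WRITE_APPEND",
        autodetect=True,
    )

    run_dataflow = DataflowTemplatedJobStartOperator(
        task_id="run_dataflow",
        job_name="after-bq-load",
        template="gs://my-bucket/templates/my-template",       # placeholder
        location="us-central1",
    )

    # Dataflow starts only after the BigQuery load task has succeeded.
    load_to_bq >> run_dataflow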

Upvotes: 1
