Reputation: 121
I am currently stuck implementing a workflow with Cloud Functions for my use case.
For instance, I have one GCS bucket and 3 Cloud Functions.
Function 1 uploads 3 files to the GCS bucket.
Function 2 is triggered when an object (or file) is created/finalized in the GCS bucket. The function loads these files into different tables in BigQuery.
Function 3 runs some queries on these BigQuery tables and loads the query results into another set of tables in BigQuery.
Now I can trigger Function 3 from Function 2 using an HTTP trigger. However, this would obviously trigger Function 3 three times (i.e. once for every file created in the GCS bucket). I would like to trigger Function 3 only ONCE, after all the files are loaded into the BigQuery tables. How would I achieve this? TIA
Upvotes: 0
Views: 123
Reputation: 6582
I think for your use case it's better to have an orchestration tool like Airflow/Cloud Composer or Cloud Workflows.
It will give you better control over your task sequencing.
Composer could be interesting if you have many DAG pipelines to orchestrate; otherwise it would be overkill for only one DAG, because Composer creates a GKE cluster. Moreover, it's not cost-effective for a small number of DAGs.
For many DAGs it can be interesting because the code is simple, based on Python, and Composer offers a complete managed solution with monitoring.
Cloud Workflows is serverless and more lightweight, and it can be better suited to your need.
The code, based on YAML, is verbose but can do the job for your use case:
TASK1/UPLOAD 3 FILES GCS >> TASK2/LOAD GCS FILES TO BQ TABLES >> TASK3/RUN QUERIES AND LOAD OTHER BQ TABLES
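As a rough sketch of that chain, a minimal Cloud Workflows definition could call each function over HTTP in sequence. This assumes all three functions are HTTP-callable, and the URLs below are hypothetical placeholders:

main:
  steps:
    # TASK1: call Function 1 to upload the 3 files to GCS
    # (URL is a placeholder; replace with your actual function endpoint)
    - uploadFilesToGcs:
        call: http.get
        args:
          url: https://REGION-PROJECT.cloudfunctions.net/function-1
          auth:
            type: OIDC
    # TASK2: call Function 2 to load the GCS files into BigQuery tables
    - loadFilesToBigQuery:
        call: http.get
        args:
          url: https://REGION-PROJECT.cloudfunctions.net/function-2
          auth:
            type: OIDC
    # TASK3: call Function 3 once, after the previous steps have completed
    - runQueriesAndLoadTables:
        call: http.get
        args:
          url: https://REGION-PROJECT.cloudfunctions.net/function-3
          auth:
            type: OIDC

Note that with this approach Function 2 would be invoked directly by the workflow rather than by the GCS finalize trigger, which is how Function 3 ends up running exactly once, after all the loads are done.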
Upvotes: 1