Bob

Reputation: 375

BigQuery SQL job dependency on Dataflow pipeline

I have an Apache Beam pipeline in Python that, for whatever reason, has a flow like the one below.

from google.cloud import bigquery
import apache_beam as beam

client = bigquery.Client()
query_job1 = client.query('create table sample_table_1 as select * from table_1')
result1 = query_job1.result()

with beam.Pipeline(options=options) as p:

    records = (
            p
            | 'Data pull' >> beam.io.Read(beam.io.BigQuerySource(...))
            | 'Transform' >> ....
            | 'Write to BQ' >> beam.io.WriteToBigQuery(...)
    )

query_job2 = client.query('create table sample_table_2 as select * from table_2')
result2 = query_job2.result()

SQL Job --> Data pipeline --> SQL Job

This sequence works fine when I run it locally. However, when I try to run this as a Dataflow pipeline, it doesn't actually run in this order.

Is there a way to enforce these dependencies while running on Dataflow?

Upvotes: 2

Views: 289

Answers (1)

Alexandre Moraes

Reputation: 4051

As @PeterKim mentioned, the processing flow you described in the comment section cannot be achieved with Dataflow alone. Currently, the Dataflow programming model does not support it.

You can use Cloud Composer to orchestrate sequential job executions that depend on one another.
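A minimal sketch of what such a Composer (Airflow) DAG could look like, assuming your Beam pipeline is packaged as a Python file uploaded to GCS (the path gs://my-bucket/pipeline.py, the DAG id, job names, and region below are placeholders) and that the Google provider operators BigQueryInsertJobOperator and DataflowCreatePythonJobOperator are available in your Composer environment:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
    from airflow.providers.google.cloud.operators.dataflow import DataflowCreatePythonJobOperator

    with DAG(
        dag_id="bq_dataflow_bq",
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,  # trigger manually
    ) as dag:

        # First SQL job: must finish before the Beam pipeline starts
        sql_job_1 = BigQueryInsertJobOperator(
            task_id="sql_job_1",
            configuration={
                "query": {
                    "query": "create table sample_table_1 as select * from table_1",
                    "useLegacySql": False,
                }
            },
        )

        # Dataflow pipeline: launched only after sql_job_1 succeeds
        dataflow_job = DataflowCreatePythonJobOperator(
            task_id="dataflow_job",
            py_file="gs://my-bucket/pipeline.py",  # hypothetical location of your Beam code
            job_name="beam-transform",
            location="us-central1",
        )

        # Second SQL job: runs only after the Dataflow pipeline finishes
        sql_job_2 = BigQueryInsertJobOperator(
            task_id="sql_job_2",
            configuration={
                "query": {
                    "query": "create table sample_table_2 as select * from table_2",
                    "useLegacySql": False,
                }
            },
        )

        # Encode the dependency chain: SQL job -> Dataflow -> SQL job
        sql_job_1 >> dataflow_job >> sql_job_2

With this layout, the ordering you relied on locally is enforced by the orchestrator rather than by the Python script itself, so each step only starts after the previous one has completed successfully.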

Upvotes: 2
