gavinest
gavinest

Reputation: 348

Dynamic Query to apache_beam.io.gcp.bigquery.ReadFromBigQuery

I need to run a dynamic query to BigQuery in my Apache Beam pipeline. The query should be evaluated at runtime based on a value in the message. i.e select * from mytable where mycolumn = << dynamic value >>

I can't seem to get the Apache Beam io connectors to work with the dynamic query. Ideally the pipeline would look something like this:


from apache_beam import Create, Pipeline
from apache_beam.io.gcp.bigquery import ReadFromBigQuery

...


with Pipeline(argv=pipeline_args) as p:
    p | Create([{"foo": "bar"}]
      | "BQ" >> ReadFromBigQuery(
                    query=f"select * from mytable where mycolumn = {message['foo']}"
                )

Here are the docs of [apache_beam.io.gcp.bigquery.ReadFromBigquery][1].

Is there a way I can accomplish this?

Upvotes: 0

Views: 1498

Answers (1)

I&#241;igo
I&#241;igo

Reputation: 2670

You can use ReadAllFromBigQuery.

Example:

sql = "SELECT unique_key FROM `bigquery-public-data.austin_311.311_service_requests` WHERE  street_number='{number}'"


def to_bq_request(element, query):
    from apache_beam.io import ReadFromBigQueryRequest
    return ReadFromBigQueryRequest(query=query.format(number=element))
 
(p | Create([14301, 1580])
   | Map(to_bq_request, query=sql)
   | ReadAllFromBigQuery()
   | Map(logging.info)
)

Upvotes: 3

Related Questions