BigQuery Python Client Library: Difference between Client.query() and Client.query_rows()

Question

I'm learning to use the BigQuery API through the Google BigQuery client library for Python v0.28 (https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/usage.html#queries), and I can't work out the difference between the following two methods for querying data:

query(query, job_config=None, job_id=None, job_id_prefix=None, retry=)
query_rows(query, job_config=None, job_id=None, job_id_prefix=None, timeout=None, retry=)

What's the difference between the two, and in what situation would you use one over the other? To me, both seem to be ways of querying a table.

Thanks a lot for providing insight into it!

Cheers!

Willian Fuks · Accepted Answer

They are indeed much the same, with one main difference: query_rows can be considered a synchronous implementation, since it waits for the job to complete and returns an iterator over the results.

Calling it blocks your Python interpreter until the job is done.

query, on the other hand, is asynchronous: it returns without waiting for the job to finish, so your interpreter is free to execute other operations whether or not the job is done, and it's up to you to wait for the job and fetch the results.
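To make the contrast concrete, here is a minimal sketch of both calling styles. The helper names fetch_blocking and fetch_while_working, the do_other_work callback, and the poll_seconds parameter are hypothetical and only for illustration. The client argument is assumed to be a google.cloud.bigquery.Client from v0.28 (where query_rows is available), and the job returned by query is assumed to expose done() and result().

```python
import time


def fetch_blocking(client, sql):
    """Synchronous style: query_rows() returns only once the job is done."""
    # The call below blocks until BigQuery has finished the job.
    return list(client.query_rows(sql))


def fetch_while_working(client, sql, do_other_work, poll_seconds=1.0):
    """Asynchronous style: query() hands back a job object to wait on later."""
    job = client.query(sql)  # starts the job; does not wait for it to finish
    while not job.done():    # check the job state without waiting for completion
        do_other_work()      # the interpreter is free for other tasks here
        time.sleep(poll_seconds)
    # The job has finished, so collecting the rows should not block for long.
    return list(job.result())
```

With a real client this might be called as fetch_blocking(bigquery.Client(), "SELECT 1"). If there is nothing useful to do while the job runs, the polling loop can be dropped and job.result() called directly, which waits in the same way query_rows does.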

As the source code shows, query_rows calls self.query internally and then calls result() on the returned job, which waits for the job to complete.
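The relationship between the two methods can be modelled with a toy client built on the standard library. This is not the library's actual source: ToyClient, ToyJob, and the simulated 0.2-second delay are assumptions made for illustration, with a thread pool standing in for the BigQuery service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class ToyJob:
    """Hypothetical stand-in for a query job; wraps a background task."""

    def __init__(self, future):
        self._future = future

    def done(self):
        # True once the background task has finished.
        return self._future.done()

    def result(self, timeout=None):
        # Blocks until the background task finishes (or the timeout expires).
        return iter(self._future.result(timeout=timeout))


class ToyClient:
    """Hypothetical client, not the real google-cloud-bigquery implementation."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)

    def _run(self, query):
        time.sleep(0.2)  # pretend the server takes a while
        return [(query, 1)]

    def query(self, query):
        # Asynchronous: submit the work and hand back a job immediately.
        return ToyJob(self._pool.submit(self._run, query))

    def query_rows(self, query, timeout=None):
        # Synchronous: start the job via query(), then wait on result().
        job = self.query(query)
        return job.result(timeout=timeout)
```

In this model query_rows adds nothing beyond waiting, which matches the description above: choose it when you need the rows before doing anything else, and choose query when you want to hold on to the job and decide yourself when to wait.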
