Reputation: 63
I have a set of IDs (~200k) and I need to get all the rows in a BigQuery Table with those IDs. I tried to construct a list in python and pass it as a parameter to the SQL query using @
but I get TypeError: 'ArrayQueryParameter' object is not iterable
error. Here is the code I tried (very similar to https://cloud.google.com/bigquery/querying-data#running_parameterized_queries):
id_list = ['id1', 'id2']
query = """
SELECT id
FROM `my-db`
WHERE id in UNNEST(@ids)
"""
query_job = client.run_async_query(
str(uuid.uuid4()),
query,
query_parameters=(
bigquery.ArrayQueryParameter('ids', 'ARRAY<STRING>', id_list)
)
)
Upvotes: 3
Views: 10185
Reputation: 866
If you want to use the simple query like client.query
, not client.run_async_query
as shown in the answers above. You can to pass an additional parameter QueryJobConfig
. Simply add your arrays to query_parameters
using bigquery.ArrayQueryParameter
.
The following code worked for me:
query = f"""
SELECT distinct pipeline_commit_id, pipeline_id, name
FROM `{self.project_id}.{self.dataset_id}.pipelines_{self.table_suffix}`,
UNNEST(labels) AS label
where label.value IN UNNEST(@labels)
"""
job_config = bigquery.QueryJobConfig(
query_parameters=[
bigquery.ArrayQueryParameter('labels', 'STRING', labels)
]
)
query_job = self.client.query(query, job_config=job_config)
Based on those examples: https://cloud.google.com/bigquery/docs/parameterized-queries
Upvotes: 0
Reputation: 11
Above answers are a better solution but you may find a use for this too whe quickly drafting something in notebooks:
turn a list into a string of date values, comma-separated and in quotes. Then pass the string into the query like so:
id_list = ['id1', 'id2']
# format into a query valid string
id_string = '"'+'","'.join(id_list)+'"'
client = bigquery.Client()
query = f"""
SELECT id
FROM `my-db`
WHERE id in {id_string}
"""
query_job=client.query(query)
results = query_job.result()
Upvotes: 0
Reputation: 11777
Probably the issue here is that you are not passing a tuple to the function.
Try adding a comma before closing the parenthesis, like so:
id_list = ['id1', 'id2']
query = """
SELECT id
FROM `my-db`
WHERE id in UNNEST(@ids)
"""
query_job = client.run_async_query(
str(uuid.uuid4()),
query,
query_parameters=(
bigquery.ArrayQueryParameter('ids', 'STRING', id_list),
)
)
In Python if you do:
t = (1)
and then run:
type(t)
You will find the result to be int
. But if you do:
t = (1,)
Then it results in a tuple.
Upvotes: 6
Reputation: 33705
You need to use 'STRING'
rather than 'ARRAY<STRING>'
for the array element type, e.g.:
query_parameters=(
bigquery.ArrayQueryParameter('ids', 'STRING', id_list)
The example from the querying data topic is:
def query_array_params(gender, states):
client = bigquery.Client()
query = """
SELECT name, sum(number) as count
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE gender = @gender
AND state IN UNNEST(@states)
GROUP BY name
ORDER BY count DESC
LIMIT 10;
"""
query_job = client.run_async_query(
str(uuid.uuid4()),
query,
query_parameters=(
bigquery.ScalarQueryParameter('gender', 'STRING', gender),
bigquery.ArrayQueryParameter('states', 'STRING', states)))
query_job.use_legacy_sql = False
# Start the query and wait for the job to complete.
query_job.begin()
wait_for_job(query_job)
print_results(query_job.results())
Upvotes: 2