Reputation: 535
Is it possible to iterate over multiple ga_sessions tables in BigQuery?
I'm not asking about how to query multiple ga_sessions tables.
Which can be done in Standard SQL using: select * from 'a:b.c.ga_sessions_*' where _TABLE_SUFFIX between '20180501' and '20180505'
.
I am asking about using a loop with date for example the answer in While loop over tables uses variables and declare cursor.
The reason is because you can only query a limited number of tables using the Querying Multiple Tables Using a Wildcard Table method.
I suspect it is only possible using a programming language like Python, but wanted to check.
Upvotes: 1
Views: 734
Reputation: 2893
You can use:
SELECT *
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
Or
SELECT *
FROM `bigquery-public-data.google_analytics_sample.*`
WHERE _TABLE_SUFFIX IN (SELECT table_id
FROM `bigquery-public-data.google_analytics_sample.__TABLES_SUMMARY__`
WHERE table_id
LIKE 'ga_session_%')
To go through all the tables within the dataset. The limitations that may apply are for the query job ones (1000 tables referenced per query) and the current wildcard table queries limitations which doesn't limit a number of tables.
Upvotes: 2
Reputation: 319
The reason is because you can only query a limited number of tables using the Querying Multiple Tables Using a Wildcard Table method.
What limit are you talking about? I do not see such a limitation
Anyway, you can UNION
several query results with wildcards, e.g.
SELECT *
FROM `a:b.c.ga_sessions_*`
WHERE _TABLE_SUFFIX between '20180501' and '20180531' # MAY
UNION ALL
SELECT *
FROM `a:b.c.ga_sessions_*`
WHERE _TABLE_SUFFIX between '20180601' and '20180630' # JUNE
# and more ...
Upvotes: 0