CArnold
CArnold

Reputation: 535

Iterate over multiple ga_sessions_[date] tables in BigQuery

Is it possible to iterate over multiple ga_sessions tables in BigQuery?

I'm not asking about how to query multiple ga_sessions tables. Which can be done in Standard SQL using: select * from 'a:b.c.ga_sessions_*' where _TABLE_SUFFIX between '20180501' and '20180505'.

I am asking about using a loop with date for example the answer in While loop over tables uses variables and declare cursor.

The reason is because you can only query a limited number of tables using the Querying Multiple Tables Using a Wildcard Table method.

I suspect it is only possible using a programming language like Python, but wanted to check.

enter image description here

Upvotes: 1

Views: 734

Answers (2)

F10
F10

Reputation: 2893

You can use:

SELECT * 
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`

Or

SELECT * 
FROM `bigquery-public-data.google_analytics_sample.*` 
WHERE _TABLE_SUFFIX IN (SELECT table_id 
  FROM `bigquery-public-data.google_analytics_sample.__TABLES_SUMMARY__`
  WHERE table_id 
  LIKE 'ga_session_%')

To go through all the tables within the dataset. The limitations that may apply are for the query job ones (1000 tables referenced per query) and the current wildcard table queries limitations which doesn't limit a number of tables.

Upvotes: 2

Sergiiko
Sergiiko

Reputation: 319

The reason is because you can only query a limited number of tables using the Querying Multiple Tables Using a Wildcard Table method.

What limit are you talking about? I do not see such a limitation

Anyway, you can UNION several query results with wildcards, e.g.

SELECT * 
FROM `a:b.c.ga_sessions_*`
WHERE _TABLE_SUFFIX between '20180501' and '20180531' # MAY
UNION ALL 
SELECT * 
FROM `a:b.c.ga_sessions_*`
WHERE _TABLE_SUFFIX between '20180601' and '20180630' # JUNE
# and more ...

Upvotes: 0

Related Questions