Joe Scotto
Joe Scotto

Reputation: 10887

Combine multiple SELECT FROM on both intraday and app_events tables in BigQuery?

Is there a way I can combine both of these select statements into one? I've tried doing:

FROM dataset1.app_events_intraday_*, dataset2.app_events_*, UNNEST (event_dim) AS event

When I run that I get a Column name event_dim is ambiguous which I'm not too sure why it would be. How could I run a query on both of these tables without having to run two unions?

Here is the full query for anyone interested:

WITH allTables AS (
  SELECT 
    CONCAT(user_dim.app_info.app_instance_id, ':', user_dim.app_info.app_platform) AS app,
    event.date,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset1.app_events_intraday_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX 
   BETWEEN '20170405' AND '20170405'

  UNION ALL
  SELECT 
    CONCAT(user_dim.app_info.app_instance_id, ':', user_dim.app_info.app_platform) AS app,
    event.date,
    user_dim.app_info.app_instance_id AS users
  FROM `dataset2.app_events_*`, UNNEST(event_dim) AS event
  WHERE _TABLE_SUFFIX 
   BETWEEN '20170405' AND '20170405'
) 
SELECT 
  COUNT(DISTINCT(users)) AS unique,
  COUNT(users) AS total
FROM allTables

Upvotes: 2

Views: 898

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173191

FROM dataset1.app_events_intraday_, dataset2.app_events_, UNNEST (event_dim) AS event
When I run that I get a Column name event_dim is ambiguous which I'm not too sure why it would be

event_dim is ambiguous because this filed exists in both app_events_intraday_* and app_events_* tables

How could I run a query on both of these tables without having to run two unions?

if you would have both sets of tables in the same dataset - you would be able to adjust FROM to something like

FROM dataset.app_events_*, UNNEST (event_dim) AS event   

Such that dataset.app_events_* pattern will cover both sets
In this case, though, you will need to adjust also WHERE clause to something like below

WHERE SUBSTR(_TABLE_SUFFIX, -8) BETWEEN '20170405' AND '20170405'   

now - bad news

Wildcard is limited for tables in the same dataset!
SO in you case you cannot eliminate use of UNION ALL

Upvotes: 3

Related Questions