Brandon
Brandon

Reputation: 23

BigQuery: Consolidate subselect into one row by date

I'm trying to have a BigQuery query result, which contains a subquery, return one row instead of two. I'm querying log files so all the data I need is in the same field. An example of the data in the field is as follow:

/?cv=p15.0.9350&ctyp=sp&bits=64&os_bits=64&hl=fr&hl=fr&os=win&osv=6.2   

The query I've been working on is below:

SELECT day, Win, Mac 
  FROM
    (SELECT DATE(metadata.timestamp) AS day, COUNT(DISTINCT protoPayload.resource) AS Win
     FROM [su_dashboard_streamed_logs.appengine_googleapis_com_request_log_20150424]
     WHERE protoPayload.resource CONTAINS 'ctyp=sp'
     GROUP BY day),
    (SELECT DATE(metadata.timestamp) AS day, COUNT(DISTINCT protoPayload.resource) AS Mac
     FROM [request_log_20150424]
     WHERE protoPayload.resource CONTAINS 'ctyp=sm'
     GROUP BY day)
ORDER BY day

Currently the query above returns:

Row day Win Mac  
1   2015-04-24  160516  null     
2   2015-04-24  null    109547  

I'd like the result to be:

Row day Win Mac
1 2015-04-24 160516 109547

Is there a way to do this? If so, any help would be greatly appreciated.

Thank you

Upvotes: 1

Views: 266

Answers (1)

Danny Kitt
Danny Kitt

Reputation: 3251

You want to JOIN the two sub-selects instead of unioning them. In BigQuery, a comma within a FROM clause indicates a union:

Note: Unlike many other SQL-based systems, BigQuery uses the comma syntax to indicate table unions, not joins.

If you JOIN on the day field, then you can compact the two rows into a single one with something like the following:

SELECT table_1.day as day, table_1.Win as Win, table_2.Mac AS Mac
  FROM
    (SELECT DATE(metadata.timestamp) AS day, COUNT(DISTINCT protoPayload.resource) AS Win
     FROM [su_dashboard_streamed_logs.appengine_googleapis_com_request_log_20150424]
     WHERE protoPayload.resource CONTAINS 'ctyp=sp'
     GROUP BY day) AS table_1
  JOIN
    (SELECT DATE(metadata.timestamp) AS day, COUNT(DISTINCT protoPayload.resource) AS Mac
     FROM [request_log_20150424]
     WHERE protoPayload.resource CONTAINS 'ctyp=sm'
     GROUP BY day) AS table_2
  ON table_1.day = table_2.day
ORDER BY day

Upvotes: 1

Related Questions