Reputation: 23
I'm trying to have a BigQuery query result, which contains a subquery, return one row instead of two. I'm querying log files so all the data I need is in the same field. An example of the data in the field is as follow:
/?cv=p15.0.9350&ctyp=sp&bits=64&os_bits=64&hl=fr&hl=fr&os=win&osv=6.2
The query I've been working on is below:
SELECT day, Win, Mac
FROM
(SELECT DATE(metadata.timestamp) AS day, COUNT(DISTINCT protoPayload.resource) AS Win
FROM [su_dashboard_streamed_logs.appengine_googleapis_com_request_log_20150424]
WHERE protoPayload.resource CONTAINS 'ctyp=sp'
GROUP BY day),
(SELECT DATE(metadata.timestamp) AS day, COUNT(DISTINCT protoPayload.resource) AS Mac
FROM [request_log_20150424]
WHERE protoPayload.resource CONTAINS 'ctyp=sm'
GROUP BY day)
ORDER BY day
Currently the query above returns:
Row day Win Mac
1 2015-04-24 160516 null
2 2015-04-24 null 109547
I'd like the result to be:
Row day Win Mac
1 2015-04-24 160516 109547
Is there a way to do this? If so, any help would be greatly appreciated.
Thank you
Upvotes: 1
Views: 266
Reputation: 3251
You want to JOIN
the two sub-selects instead of unioning them. In BigQuery, a comma within a FROM
clause indicates a union:
Note: Unlike many other SQL-based systems, BigQuery uses the comma syntax to indicate table unions, not joins.
If you JOIN
on the day field, then you can compact the two rows into a single one with something like the following:
SELECT table_1.day as day, table_1.Win as Win, table_2.Mac AS Mac
FROM
(SELECT DATE(metadata.timestamp) AS day, COUNT(DISTINCT protoPayload.resource) AS Win
FROM [su_dashboard_streamed_logs.appengine_googleapis_com_request_log_20150424]
WHERE protoPayload.resource CONTAINS 'ctyp=sp'
GROUP BY day) AS table_1
JOIN
(SELECT DATE(metadata.timestamp) AS day, COUNT(DISTINCT protoPayload.resource) AS Mac
FROM [request_log_20150424]
WHERE protoPayload.resource CONTAINS 'ctyp=sm'
GROUP BY day) AS table_2
ON table_1.day = table_2.day
ORDER BY day
Upvotes: 1