Reputation: 934
Trying to combine two columns into one in BQ,
my current table looks like:
+-------+--------------------------------------+------+-----+
| id | time | color1 |color2|type |
+-------+--------------------------------------+------+-----+
| 10954 | 2018-09-09 23:20:01.074 UTC | yellow | blue | 1 |
+-------+--------------------------------------+------+-----+
| 10954 | 2018-10-09 20:38:61.151 UTC | red | blue | 1 |
+-------+--------------------------------------+------+-----+
| 20562 | 2018-08-09 19:49:14.391 UTC | green | red | 0 |
+-------+--------------------------------------+------+-----+
| 20562 | 2017-09-09 17:02:22.903 UTC | green | red | 1 |
+-------+--------------------------------------+------+-----+
And my goal table would be:
+-------+--------------------------------------+------+
| id | time | color | type |
+-------+--------------------------------------+------+
| 10954 | 2018-09-09 23:20:01.074 UTC | yellow | 1 |
+-------+--------------------------------------+------+
| 10954 | 2018-10-09 20:38:61.151 UTC | red | 1 |
+-------+--------------------------------------+------+
| 10954 | 2018-09-09 23:20:01.074 UTC | blue | 0 |
+-------+--------------------------------------+------+
| 20562 | 2018-08-09 19:49:14.391 UTC | green | 0 |
+-------+--------------------------------------+------+
| 20562 | 2017-09-09 17:02:22.903 UTC | green | 1 |
+-------+--------------------------------------+------+
| 20562 | 2017-09-09 17:02:22.903 UTC | red | 0 |
+-------+--------------------------------------+------+
By doing so, this would create new rows for color2
, where id
would be duplicated, time
would be the min time for the id group, and type = 0
. Is it possible to do this in a case when
statement when creating the new color column or with a CTE?
Upvotes: 0
Views: 786
Reputation: 173210
Below is for BigQuery Standard SQL
#standardSQL
SELECT id, time, color1 AS color, type
FROM `project.dataset.table`
UNION ALL
SELECT id, MIN(time) AS time, color2 AS color, 0 type
FROM `project.dataset.table`
GROUP BY id, color2
You can test, play with above using dummy dta from your question as below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 10954 id, '2018-09-09 23:20:01.074 UTC' time, 'yellow' color1, 'blue' color2, 1 type UNION ALL
SELECT 10954, '2018-10-09 20:38:61.151 UTC', 'red', 'blue', 1 UNION ALL
SELECT 20562, '2018-08-09 19:49:14.391 UTC', 'green', 'red', 0 UNION ALL
SELECT 20562, '2017-09-09 17:02:22.903 UTC', 'green', 'red', 1
)
SELECT id, time, color1 AS color, type
FROM `project.dataset.table`
UNION ALL
SELECT id, MIN(time) AS time, color2 AS color, 0 type
FROM `project.dataset.table`
GROUP BY id, color2
-- ORDER BY id
with result
Row id time color type
1 10954 2018-09-09 23:20:01.074 UTC yellow 1
2 10954 2018-10-09 20:38:61.151 UTC red 1
3 10954 2018-09-09 23:20:01.074 UTC blue 0
4 20562 2018-08-09 19:49:14.391 UTC green 0
5 20562 2017-09-09 17:02:22.903 UTC green 1
6 20562 2017-09-09 17:02:22.903 UTC red 0
Upvotes: 2