Reputation: 1192
I have simple data in my BQ table:
Row make model cost color
1 hhkb hybrid 300 cream dark
2 durgod K320 120 grey cream
3 durgod hades 130 white black
4 rf pf2 300 white black
5 drop alt 200 white black silver bronze
6 idobao id67 200 white black silver rust
What I want to get to is splitting the color column on spaces and having the data presented as denormalise; a row with all details for every color option. So it would look like this:
Row make model cost color
1 hhkb hybrid 300 cream
2 hhkb dark
3 durgod K320 120 grey
4 durgod cream
5 durgod hades 130 white
6 durgod black
7 rf pf2 300 white
8 rf black
9 drop alt 200 white
10 drop black
11 drop silver
12 drop bronze
13 idobao id67 200 white
14 idobao black
15 idobao silver
16 idobao rust
I appreciate that whatever technique works for one colum, lets say 'make' can probably be repeated for other columns too, this is why in the example above I'm only asking about how to forward fill 'make' following a SPLIT on 'color' column.
What I have so far is:
SELECT
make,
model,
cost,
SPLIT(color, ' ') color,
LAST_VALUE(make IGNORE NULLS) OVER (PARTITION BY color ORDER BY model ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS maker,
FROM
`learnsql-xxxxxx.sampledata.keyboards`
LIMIT
1000
But this only results in output like:
This makes me think that there must be something wrong in the way I am splitting the color column as its not resulting in new rows.
Would really appreciate if anyone could give me a shove in the right direction! Needless to say I am brand new to SQL.
After a lot of experimentation I came up with this:
WITH keybs AS (
SELECT
make AS make,
model AS model,
cost AS cost,
SPLIT(color, ' ') color
FROM
`learnsql-xxxxxx.sampledata.keyboards`
)
SELECT
keybs.make,
keybs.model,
keybs.cost,
color
FROM keybs
CROSS JOIN UNNEST(keybs.color) AS color
LIMIT
1000
Which gives the required output. Is that the best way to do it?
Upvotes: 2
Views: 395
Reputation: 172993
Consider below approach
select t.* except(color), color
from `learnsql-xxxxxx.sampledata.keyboards` t
left join unnest(split(color, ' ')) color
In addition to just be [significantly] less verbose - it uses left join
instead of cross join
. the reason is - if some of rows do not have colors (null value) - with cross join
such row will be excluded from output, while with left join
such row will be preserved
Upvotes: 1