Reputation: 1137
I have a table as shown below on BigQuery
Name | Score
Tim | 63 > 89 > 90
James| 67 > 44
I want to split the Score
column into N
separate columns where N is the maximum score length in the entire table. I would like the table to be as follow.
Name| Score_1 | Score_2 | Score_3
Tim | 63 | 89 | 90
James| 67 | 44 | 0 or NA
I tried the Split
command but I end up doing a new row for each Name-Score combination.
Upvotes: 0
Views: 599
Reputation: 172974
For BigQuery Standard SQL
Below is simple case and assumes you know in advance the expected max score length (3 in below example)
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 'Tim' name, '63 > 89 > 90' score UNION ALL
SELECT 'James', '67 > 44'
)
SELECT
name,
score[SAFE_OFFSET(0)] AS score_1,
score[SAFE_OFFSET(1)] AS score_2,
score[SAFE_OFFSET(2)] AS score_3
FROM (
SELECT name, SPLIT(score, ' > ') score
FROM `project.dataset.your_table`
)
with result
Row name score_1 score_2 score_3
1 Tim 63 89 90
2 James 67 44 null
Of course above approach means - if you have many scores - like 10 or 20 or more - you will need to add respective number of extra lines like below
score[SAFE_OFFSET(20)] AS score_21
So, above gives you what you wanted from schema of output point of view
At the same time, below makes more sense to me and in most practical cases is better and most optimal :
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 'Tim' name, '63 > 89 > 90' score UNION ALL
SELECT 'James', '67 > 44'
)
SELECT name, score
FROM `project.dataset.your_table`, UNNEST(SPLIT(score, ' > ')) score
with result
Row name score
1 Tim 63
2 Tim 89
3 Tim 90
4 James 67
5 James 44
Upvotes: 2