Reputation: 519
I have a very special data table that i want to transform for a visualisation purpose (see image). I am aware of the cross join unnest which does the equivalent of a cartesian product, however in this case i need to reduplicate depending on the value next to the *
Upvotes: 0
Views: 2652
Reputation: 173046
Below is yet another simple option for BigQuery Standard SQL (based on use of REPEAT function)
#standardSQL
SELECT id, value
FROM `project.dataset.table` t,
UNNEST(SPLIT(REGEXP_REPLACE(t.values, r'\[|]', ''))) x,
UNNEST(SPLIT(REPEAT(
CONCAT(',', SPLIT(x, '*')[OFFSET(0)]),
IFNULL(CAST(SPLIT(x, '*')[SAFE_OFFSET(1)] AS INT64), 1)
))) value
WHERE value != ''
You can test, play with above using sample data from your question as in example below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 AS id, '[5*2,8,6]' AS `values` UNION ALL
SELECT 2, '[5*2,0*3]' UNION ALL
SELECT 3, '[1*1,2,5,6]'
)
SELECT id, value
FROM `project.dataset.table` t,
UNNEST(SPLIT(REGEXP_REPLACE(t.values, r'\[|]', ''))) x,
UNNEST(SPLIT(REPEAT(
CONCAT(',', SPLIT(x, '*')[OFFSET(0)]),
IFNULL(CAST(SPLIT(x, '*')[SAFE_OFFSET(1)] AS INT64), 1)
))) value
WHERE value != ''
with result
Row id value
1 1 5
2 1 5
3 1 8
4 1 6
5 2 5
6 2 5
7 2 0
8 2 0
9 2 0
10 3 1
11 3 2
12 3 5
13 3 6
Upvotes: 1
Reputation: 33755
Here is a solution using a SQL UDF, alternatively. This should be faster since it avoids the latency of using the JavaScript v8 sandbox:
CREATE TEMP FUNCTION ExpandList(input STRING) AS (
ARRAY(
-- Find the value before the *
SELECT SPLIT(elem, '*')[OFFSET(0)]
-- For each comma-separated element inside the braces
FROM UNNEST(REGEXP_EXTRACT_ALL(input, r'[^\[\],]+')) AS elem,
-- Repeated by the value after the *, or once if there is no *
UNNEST(GENERATE_ARRAY(1, IFNULL(CAST(SPLIT(elem, '*')[SAFE_OFFSET(1)] AS INT64), 1))))
);
WITH Input AS (
SELECT 1 AS id, '[5*2,8,6]' AS values UNION ALL
SELECT 2, '[5*2,0*3]' UNION ALL
SELECT 3, '[1,2,5,6]'
)
SELECT id, value
FROM Input,
UNNEST(ExpandList(values)) AS value;
Upvotes: 3
Reputation: 3642
Below is an SQL example with a TEMP function which demonstrate how you can flatten your array
CREATE TEMP FUNCTION flatten(input ARRAY<STRING>)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
let flatten = []
for (let j = 0; j < input.length; j++) {
if (input[j].indexOf('*') === -1) {
flatten.push(input[j])
} else {
let prefix = input[j].split('*')[1]
let value = input[j].split('*')[0]
for (let i = 0; i < prefix; i++) {
flatten.push(value)
}
}
}
return flatten
""";
WITH numbers AS
(SELECT ['5*2','8','6'] as value
)
SELECT flatten(value) as product
FROM numbers;
The output of this call as requested is:
Upvotes: 2