itsPav
itsPav

Reputation: 612

BigQuery Arrays in Array to Proper Columns

I'm trying to manipulate this;

id | data
1  | [[item1,2,1.99],[item2,1,4.99]]
2  | [[item1,2,1.99]]

into this:

id | A     | B | C    |
1  | item1 | 2 | 1.99 |
   | item2 | 1 | 4.99 |
2  | item1 | 2 | 1.99 |
...'

The closest I got is;

id | data |
1  | item1, 2, 1.99 |
   | item2, 1, 4.99 |
2  | item1, 2, 1.99 |

How do I split that array of strings into columns?

Here's my current query;

SELECT 
   id,
   ARRAY(SELECT * FROM UNNEST(SPLIT(SUBSTR(line_items, 3 , LENGTH(line_items) - 4),'], [')) AS line_items) AS line_items
FROM raw_data.sales 

This query does the same;

SELECT 
   id,
   SPLIT(SUBSTR(line_items, 3 , LENGTH(line_items) - 4),'], [') AS line_items
FROM raw_data.sales

Solution

SELECT
  ID id,
  ARRAY_AGG(STRUCT(SPLIT(A, ',')[OFFSET(0)] AS A,
  SPLIT(A, ',')[OFFSET(1)] AS B,
  SPLIT(A, ',')[OFFSET(2)] AS C))
FROM
  TABLE t,
  UNNEST(SPLIT(SUBSTR(DATA, 3 , LENGTH(DATA) - 4),'],[')) A
GROUP BY id

Upvotes: 0

Views: 3005

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Below is for BigQuery Standard SQL

#standardSQL
SELECT id, 
  ARRAY(SELECT SPLIT(value, ',')[OFFSET(0)] FROM UNNEST(x.y) value) AS A,
  ARRAY(SELECT SPLIT(value, ',')[OFFSET(1)] FROM UNNEST(x.y) value) AS B,
  ARRAY(SELECT SPLIT(value, ',')[OFFSET(2)] FROM UNNEST(x.y) value) AS C
FROM `project.dataset.my_table`,
UNNEST([STRUCT(SPLIT(SUBSTR(DATA, 3, LENGTH(DATA) - 4),'],[') AS y)]) x

You can test, play with above using sample data from your question as in example below

#standardSQL
WITH `project.dataset.my_table` AS (
  SELECT 1 AS ID, '[[item1,2,1.99],[item2,1,4.99]]' AS DATA UNION ALL
  SELECT 2, '[[item1,2,1.99]]'
)
SELECT
  ID id, 
  ARRAY(SELECT SPLIT(value, ',')[OFFSET(0)] FROM UNNEST(x.y) value) AS A,
  ARRAY(SELECT SPLIT(value, ',')[OFFSET(1)] FROM UNNEST(x.y) value) AS B,
  ARRAY(SELECT SPLIT(value, ',')[OFFSET(2)] FROM UNNEST(x.y) value) AS C
FROM `project.dataset.my_table`,
UNNEST([STRUCT(SPLIT(SUBSTR(DATA, 3, LENGTH(DATA) - 4),'],[') AS y)]) x

with result

Row id  A       B   C    
1   1   item1   2   1.99     
        item2   1   4.99     
2   2   item1   2   1.99    

Upvotes: 2

Teddy
Teddy

Reputation: 645

Assuming your nested arrays have a constant length you should be able to use the below query.

EDIT: The returned value from the UNNEST() is a string, you simply need to use SPLIT() on the string returned to change its type to be an array. You can then use OFFSEET(n) to split it into different columns.

QUERY

-- CREATES A DUMMY TABLE WITH SAMPLE DATA 
WITH TABLE AS (
  SELECT
    1 AS ID,
    '[[item1,2,1.99],[item2,1,4.99]]' AS DATA
  UNION ALL
  SELECT
    2 AS ID,
    '[[item1,2,1.99]]' AS DATA
)

SELECT
  ID id,
  SPLIT(A, ',')[OFFSET(0)] AS A,
  SPLIT(A, ',')[OFFSET(1)] AS B,
  SPLIT(A, ',')[OFFSET(2)] AS C
FROM
  TABLE t,
  UNNEST(SPLIT(SUBSTR(DATA, 3 , LENGTH(DATA) - 4),'],[')) A

OUTPUT

enter image description here

Upvotes: 1

Related Questions