Alex
Alex

Reputation: 1537

Aggregate columns and build array in bigquery

Starting with a table like this

WITH table AS (
SELECT 1001 as ID, 1 As Color_blue, 0  AS Color_red,  0 AS Color_black UNION ALL
SELECT 1002 as ID, 0 As Color_blue, 0  AS Color_red,  1 AS Color_black UNION ALL
SELECT 1003 as ID, 0 As Color_blue, 1  AS Color_red,  0 AS Color_black UNION ALL
SELECT 1004 as ID, 0 As Color_blue, 0  AS Color_red,  1 AS Color_black )

SELECT *
FROM table

enter image description here

I want to aggregate all the features columns and build an array as following:

select ID, array<float64>[Color_blue, Color_red, Color_black] features
from table

enter image description here

However, I want to include all the columns names dynamically without hard coding them in my query. The output should stay the same. How do I create an array of floats with all the features, assuming I don't know the column names (only "ID")?

Upvotes: 0

Views: 49

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

Below is for BigQuery Standard SQL

#standardSQL
SELECT id, 
  ARRAY(
    SELECT CAST(val AS FLOAT64) 
    FROM UNNEST(SPLIT(TRIM(FORMAT('%t', t), '()'))) val 
    WHERE CAST(id AS STRING) != val
  ) AS features
FROM `project.dataset.table` t   

when applied to sample data from your question - result is

Row id      features     
1   1001    1.0  
            0.0  
            0.0  
2   1002    0.0  
            0.0  
            1.0  
3   1003    0.0  
            1.0  
            0.0  
4   1004    0.0  
            0.0  
            1.0  

Upvotes: 1

Related Questions