Diego-S
Diego-S

Reputation: 77

Flatten Nested Array and Aggregate in Snowflake

My table column has nested arrays in a Snowflake database. I want to perform some aggregations using SQL (Snowflake SQL).

My table name is: DATA

The PROJ column is of VARIANT data type. The nested arrays will not always be 3, and I demonstrated that in the DATA table.


| ID |             PROJ              | LOCATION |
|----|-------------------------------|----------|
| 1  |[[0, 4], [1, 30], [10, 20]]    |    S     |
| 2  |[[0, 2], [1, 20]]              |    S     |
| 3  |[[0, 8], [1, 10], [10, 100]]   |    S     |

Desired Output:

| Index | LOCATION |  Min | Max | Mean|
|-------|----------|------|-----|-----|
| 0     |    S     |  2   |  8  | 4.66|
| 1     |    S     |  10  |  30 | 20  |
| 10    |    S     |  20  |  100| 60  |

Upvotes: 3

Views: 918

Answers (1)

Lukasz Szozda
Lukasz Szozda

Reputation: 176144

First the nested array should be flattened, then Index is the first element of subarray and Value is the second element(array is 0-based):

CREATE OR REPLACE TABLE DATA
AS
SELECT 1 AS ID, [[0, 4], [1, 30], [10, 20]]   AS PROJ UNION
SELECT 2 AS ID, [[0, 2], [1, 20]]             AS PROJ UNION
SELECT 3 AS ID, [[0, 8], [1, 10], [10, 100]]  AS PROJ;

Query:

SELECT s.VALUE[0]::INT AS Index,
       MIN(s.VALUE[1]::INT) AS MinValue,
       MAX(s.VALUE[1]::INT) AS MaxValue,
       AVG(s.VALUE[1]::INT) AS MeanValue
FROM DATA
,LATERAL FLATTEN(input=> PROJ) s
GROUP BY s.VALUE[0]::INT
ORDER BY Index;

Output:

enter image description here

Upvotes: 3

Related Questions