Felipe Hoffa
Felipe Hoffa

Reputation: 59325

Array_Agg does not support Window Frame in Snowflake - how to achieve?

I need to run:

select arrayagg(o_clerk) 
  within group (order by o_orderkey desc) 
  OVER (PARTITION BY o_orderkey order by o_orderkey 
     ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS RESULT
from sample_data

But Snowflake returns the error Sliding window frame unsupported for function ARRAYAGG. If I try to accumulate all without a sliding window, I get the error Cumulative window frame unsupported for function ARRAY_AGG.

How can I achieve this?

Sample data:

create or replace table sample_data as (
    with data as (
        select 1 a, [1,3,2,4,7,8,10] b
        union all select 2, [1,3,2,4,7,8,10]
    )

    select 'Ord'||a o_orderkey, 'c'||value o_clerk, index
    from data, table(flatten(b))
)
;

enter image description here

Desired result:

enter image description here

(source, for a BigQuery migration)

Upvotes: 1

Views: 2207

Answers (4)

Lukasz Szozda
Lukasz Szozda

Reputation: 176144

Starting from version 9.1 ARRAY_AGG window frames are natively supported:

SELECT *,ARRAY_AGG(o_clerk) OVER (PARTITION BY o_orderkey ORDER BY index 
                                  ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) RESULT
FROM sample_data;

Output:

enter image description here

Upvotes: 0

Emanuel Oliveira
Emanuel Oliveira

Reputation: 116

Thanks @Felipe Hoffa here's my solution posted initially in:

https://community.snowflake.com/s/question/0D73r000006upCECAY/detail?fromEmail=1&s1oid=00Di0000000hZh2&s1nid=0DB3100000001Fq&s1uid=0050Z000009Xcck&s1ext=0&emkind=chatterCommentNotification&emtm=1669252526852

select o_orderkey, 
    array_compact([
        lag(o_clerk, 3) over(partition by o_orderkey order by index)
        , lag(o_clerk, 2) over(partition by o_orderkey order by index)
        , lag(o_clerk, 1) over(partition by o_orderkey order by index)
        , o_clerk
    ])
from sample_data

enter image description here

Upvotes: 3

Lukasz Szozda
Lukasz Szozda

Reputation: 176144

Using the ARRAY_AGG and ARRAY_SLICE. Inspired by Rajat's answer

SELECT *
   ,IFF(ROW_NUMBER() OVER(PARTITION BY o_orderkey ORDER BY INDEX) <= 4, 0, 
        ROW_NUMBER() OVER(PARTITION BY o_orderkey ORDER BY INDEX)-4) AS start_index
   ,IFF(ROW_NUMBER() OVER(PARTITION BY o_orderkey ORDER BY INDEX) <= 4, 
        ROW_NUMBER() OVER(PARTITION BY o_orderkey ORDER BY INDEX),4) AS num_elem

   ,ARRAY_SLICE(
         ARRAY_AGG(o_clerk) WITHIN GROUP (ORDER BY INDEX)
                            OVER(PARTITION BY o_orderkey)
         ,start_index
         ,start_index + num_elem) 
FROM sample_data
ORDER BY O_ORDERKEY, INDEX;

Output:

enter image description here


The start_index and num_elem computation could be further simplified/generalized to emulate window sliding frame ROWS BETWEEN PRECEDING prec AND FOLLOWING foll.

SELECT *
   ,ROW_NUMBER() OVER(PARTITION BY o_orderkey ORDER BY INDEX) AS rn
   ,3 AS prec
   ,0 AS foll
   ,ARRAY_SLICE(
         ARRAY_AGG(o_clerk) WITHIN GROUP (ORDER BY INDEX) 
                            OVER(PARTITION BY o_orderkey)
         ,IFF(rn <= prec+1, 0, rn-(prec+1))
         ,IFF(rn <= prec+1, 0, rn-(prec+1)) + IFF(rn <= prec+1, rn+foll,prec+1+foll)
   ) 
FROM sample_data
ORDER BY O_ORDERKEY, INDEX;

Displaying each argument of ARRAY_SLICE for debug:

enter image description here

Upvotes: 2

Felipe Hoffa
Felipe Hoffa

Reputation: 59325

Since we can't get these results with ARRAY_AGG(), we can through a self join.

First we give each row a row number, and then we choose everything between its number and +3:

with numbered as (
    select o_orderkey, o_clerk, index
    from sample_data
), crossed as (
    select a.o_orderkey, a.index ai, b.index bi, b.o_clerk
    from numbered a
    join numbered b
    on a.o_orderkey = b.o_orderkey
    and a.index between b.index and b.index+3
)

select o_orderkey, array_agg(o_clerk) within group (order by bi)
from crossed
group by o_orderkey, ai
order by o_orderkey, max(bi)

Note that to the original question I had to add a index field - so we could have an unambiguous sorting order within rows.

enter image description here

Upvotes: 0

Related Questions