titutubs
titutubs

Reputation: 365

Aggregate window functions with an ORDER BY clause require a frame clause SQL

I am working in Redshift for my query. I am trying to fill in the last non-null value for user purchases. However I keep getting this error: Aggregate window functions with an ORDER BY clause require a frame clause. I am not sure why. Here is my current query:

    with table_a as (
    select 
    user_id,
    date,
    ,SUM(CASE WHEN purchase_amount IS NULL THEN 0 ELSE 1 END) OVER (PARTITION BY user_id ORDER BY date rows between unbounded preceding and unbounded following) AS grp

    )
    select *,
  FIRST_VALUE(purchase_amount) OVER (PARTITION BY grp, user_id ORDER BY date) AS test
 from table_a 

Upvotes: 0

Views: 3381

Answers (1)

Bill Weiner
Bill Weiner

Reputation: 11082

The frame clause is needed to indicate which subset of rows in the frame are to be considered. When you add an order by clause the database expects that you want to only look at a subset of the frame. For example let’s take your SUM() window function.

The window you have now:

SUM(CASE WHEN purchase_amount IS NULL THEN 0 ELSE 1 END) OVER (PARTITION BY user_id ORDER BY date rows between unbounded preceding and unbounded following)

Is exactly the same as the window function without the order by and the frame clause. It will produce the sum of all the rows in the partition on every row in the partition - the exact same value on every row. This simpler code looks like:

SUM(CASE WHEN purchase_amount IS NULL THEN 0 ELSE 1 END) OVER (PARTITION BY user_id)

The order by clause isn’t needed to perform this calculation so when you add it the database thinks you are making a mistake.

If you instead wanted a rolling sum, then an order by clause is needed but so is the frame clause. These define which rows are summed for each row. A rolling sum would look like:

SUM(CASE WHEN purchase_amount IS NULL THEN 0 ELSE 1 END) OVER (PARTITION BY user_id ORDER BY date rows between unbounded preceding and current row)

The short answer is that the database thinks you want to make a sub selection of rows in the frame when you have an order by clause.

Upvotes: 0

Related Questions