SmcMichael
SmcMichael

Reputation: 67

Get the Average of a Datediff function using a partition by in Snowflake

I am looking to understand what the average amount of days between transactions is for each of the customers in my database using Snowflake.

SELECT 
      Customer_ID
    , Day_ID
    , DATEDIFF(DAY, LAG(Day_ID) 
          OVER(PARTITION BY Customer_ID 
                   ORDER BY DAY_ID), DAY_ID) AS Time_Since
FROM TABLE
ORDER BY
          Customer_ID
        , Day_ID

The code above works to get me the time_elapsed but when I try to add an average function I get an error:

SELECT 
      Customer_ID
      AVG(DATEDIFF(DAY, LAG(Day_ID) 
              OVER(PARTITION BY Customer_ID 
                       ORDER BY DAY_ID), DAY_ID)) AS AVG_Time_Since
FROM TABLE
ORDER BY Customer_ID
GROUP BY Customer_ID

The error reads:

SQL compilation error: 
Window function [
LAG(TABLE.DAY_ID) 
OVER (PARTITION BY TABLE.CUSTOMER_ID 
          ORDER BY TABLE.DAY_ID ASC NULLS LAST)
] may not appear inside an aggregate function.

Any ideas?

Upvotes: 1

Views: 1392

Answers (1)

Nat Taylor
Nat Taylor

Reputation: 1108

You can nest them and get the answer you're seeking.

Note: You can simply delete the cte from the beginning of this and replace from cte with from YourTable

WITH CTE AS (
             SELECT 
                       COLUMN1       AS CUSTOMER_ID
                     , COLUMN2::DATE AS DAY_ID
             FROM
             VALUES
                       (1, '2019-01-01'), (1, '2019-01-06'), (1, '2019-01-15')
                     , (1, '2019-01-25'), (1, '2019-01-27'), (1, '2019-01-31')
                     , (2, '2019-01-01'), (2, '2019-01-08'), (2, '2019-01-13')
                     , (2, '2019-01-17'), (2, '2019-01-21'), (2, '2019-01-25')
                     , (2, '2019-02-02'), (3, '2019-02-12'), (3, '2019-02-14')
                     , (3, '2019-02-18'), (3, '2019-02-23'), (3, '2019-03-04')
                     , (3, '2019-03-10')
            )
SELECT
          CUSTOMER_ID       AS CUSTOMER_ID
        , AVG(TIME_SINCE)   AS AVG_TIME_SINCE
FROM (
        SELECT
                  CUSTOMER_ID
                , DAY_ID
                , DATEDIFF(DAY
                         , LAG(DAY_ID)
                           OVER(PARTITION BY CUSTOMER_ID
                                    ORDER BY DAY_ID
                               ), DAY_ID
                          ) AS TIME_SINCE
        FROM CTE
        ORDER BY
                  CUSTOMER_ID
                , DAY_ID
      )
GROUP BY CUSTOMER_ID;

Upvotes: 2

Related Questions