Calculate a 3-month moving average from non-aggregated data

Question

I have a bunch of orders. Each order is either a type A or a type B order. I want a 3-month moving average of the time it takes to ship orders of each type. How can I aggregate this order data into what I want using Redshift or Postgres SQL?

Start with this:

order_id	order_type	ship_date	time_to_ship
1	a	2021-12-25	100
2	b	2021-12-31	110
3	a	2022-01-01	200
4	a	2022-01-01	50
5	b	2022-01-15	110
6	a	2022-02-02	100
7	a	2022-02-28	300
8	b	2022-04-05	75
9	b	2022-04-06	210
10	a	2022-04-15	150

Note: Some months have no shipments. The solution should allow for this.

I want this:

order_type	ship_month	mma3_time_to_ship
a	2022-02-01	150
a	2022-04-01	160
b	2022-04-01	126.25

Here, a 3-month moving average is only calculated for months with at least 2 preceding months. Each record is one order type and month. The ship_month column denotes the month of shipment (Redshift represents months as the date of the first of the month).

Here's how the mma3_time_to_ship column is calculated, expressed as Excel-like formulas:

150 = AVERAGE(100, 200, 50, 100, 300) <- The average for all A orders in Dec, Jan, and Feb.

160 = AVERAGE(200, 50, 100, 300, 150) <- The average for all A orders in Jan, Feb, and Apr (no orders in Mar).

126.25 = AVERAGE(110, 110, 75, 210) <- The average for all B orders in Dec, Jan, and Apr (no B orders in Feb, no orders at all in Mar).

My attempt neither aggregates the orders into monthly data nor produces 3-month averages (this query runs without error in Redshift):

SELECT
  order_type,
  DATE_TRUNC('month', ship_date) AS ship_month,
  AVG(time_to_ship) OVER (
    PARTITION BY
      order_type,
      ship_month
    ORDER BY ship_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS avg_time_to_ship
FROM tbl

Is what I want possible?
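
For reference, one way the calculation described above could be sketched is to aggregate to one row per order type and month first, then apply the window over those monthly rows. This is only a sketch under assumptions: it reuses the table name tbl from the attempt, treats the "3-month" window as the 3 most recent months that have orders of that type (which is what the Excel-like formulas imply), and the intermediate names (monthly, windowed, total_time, order_count, window_total, window_orders, months_in_window) are made up for illustration.

```sql
-- Step 1: collapse orders to one row per order type and month.
-- Keep the sum and the count (not the monthly average) so the moving
-- average is weighted by order, matching AVERAGE(...) over raw orders.
WITH monthly AS (
  SELECT
    order_type,
    DATE_TRUNC('month', ship_date)::date AS ship_month,
    SUM(time_to_ship) AS total_time,
    COUNT(*) AS order_count
  FROM tbl
  GROUP BY 1, 2
),
-- Step 2: for each month, add up the current row and the 2 preceding
-- monthly rows of the same order type. Months with no orders have no
-- row, so they are skipped rather than counted as empty months.
windowed AS (
  SELECT
    order_type,
    ship_month,
    SUM(total_time) OVER (
      PARTITION BY order_type ORDER BY ship_month
      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS window_total,
    SUM(order_count) OVER (
      PARTITION BY order_type ORDER BY ship_month
      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS window_orders,
    COUNT(*) OVER (
      PARTITION BY order_type ORDER BY ship_month
      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS months_in_window
  FROM monthly
)
-- Step 3: keep only months that have 2 preceding monthly rows.
SELECT
  order_type,
  ship_month,
  window_total::float / window_orders AS mma3_time_to_ship
FROM windowed
WHERE months_in_window = 3
ORDER BY order_type, ship_month;
```

Worked by hand against the sample data, this gives 750 / 5 = 150 for a in 2022-02, 800 / 5 = 160 for a in 2022-04, and 505 / 4 = 126.25 for b in 2022-04. If the window should instead cover 3 calendar months regardless of gaps, the monthly rows would need to be joined to a generated calendar of months first.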
