strapped_gopherbro
strapped_gopherbro

Reputation: 53

Partition by Date Range in Window Function

I'm trying to write a query that returns, for each of the last 44 days, a count of the rentals made in the 7-day window preceding that day.

This is tricky because not all dates in the set are consecutive, and dates without rentals are not rows in the data set.

Here is where I am downloading the data from: https://www.postgresqltutorial.com/postgresql-sample-database/

I know this requires the use of a WINDOW function and most likely, the conjunction of both the PREDICT BY and ORDER BY clauses, but my results are returning duplicate dates and the count column is displaying the same value for every row. Here is my code:

SELECT DISTINCT date_trunc('day', rental_date), count(rental_id) OVER w
FROM rental
GROUP BY rental_date, rental_id
WINDOW w AS (PARTITION BY (rental_date BETWEEN DATE '2005-08-23' - INTERVAL '44days' AND DATE '2005-08-23') 
             ORDER BY rental_date ROWS BETWEEN 7 PRECEDING AND CURRENT ROW)
ORDER BY date_trunc('day', rental_date) DESC;

The expected output would look something like:

       Col1                            Col2                               
date_trunc1                count(rental_id) 
2006-02-21 00:00:00                     182
2006-02-20 00:00:00                     182
2006-02-19 00:00:00                     182
2006-02-18 00:00:00                     182
2006-02-17 00:00:00                     182
2006-02-16 00:00:00                     182                           
2006-02-15 00:00:00                     182
2005-08-30 00:00:00                     598
2005-08-29 00:00:00                    1224
2005-08-28 00:00:00                    1883
2005-08-27 00:00:00                    2507  
2005-08-26 00:00:00                    3135
2005-08-25 00:00:00                    3756    
2005-08-24 00:00:00                    4349
2005-08-23 00:00:00                    3374
2005-08-22 00:00:00                    3148
2005-08-21 00:00:00                    2489
2005-08-20 00:00:00                    1865
2005-08-19 00:00:00                    1237
2005-08-18 00:00:00                     616
2005-08-17 00:00:00                      23
2005-08-16 00:00:00                       0
2005-08-08 00:00:00                     671
2005-08-07 00:00:00                    1305

*excluding dates that are not rows in the dataset, of course. This is why the 7 days preceding 2006-02-14 are all counted as 0.

Upvotes: 2

Views: 23411

Answers (2)

Jonathan Jacobson
Jonathan Jacobson

Reputation: 1518

I haven't tested it but something like this should do.

WITH t AS (
    SELECT date_trunc('day', rental_date) rental_date, count(rental_id) cnt
    FROM rental
    WHERE rental_date >= CURRENT_DATE - INTERVAL '44 DAYS'
    GROUP BY 1
)
SELECT rental_date, SUM(cnt) OVER w
FROM t
WINDOW w AS (ORDER BY rental_date ROWS BETWEEN 7 PRECEDING AND CURRENT ROW)
ORDER BY rental_date DESC;

First you count rows per day and then you sum those up in order to get one row per day.

Upvotes: 1

Laurenz Albe
Laurenz Albe

Reputation: 246483

Since you didn't post any test data, I'll go by your description:

SELECT rental_date, count
FROM (SELECT rental_date::date,
             count(*) OVER (ORDER BY rental_date::date
                            RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW)
      FROM rental
      WHERE rental_date::date BETWEEN current_date - (44 + 7)
                              AND current_date - 1
     ) subq
WHERE rental_date >= current_date - 44
ORDER BY rental_date;

Of course rows will be conted several times, because that is what you say you want.

Upvotes: 5

Related Questions