KOB
KOB

Reputation: 4565

How to not apply a where clause to a window function

I have a table that shows purchases of customers like this:

day,transaction_id,customer_id,price
2020-01-01,1,100,10
2020-01-02,2,200,20
2020-01-03,3,100,30
2020-01-04,4,200,40
2020-02-01,5,100,50
2020-02-02,6,200,60
2020-02-03,7,100,70
2020-02-04,8,200,80

I want a query that shows the sum of all transactions to date per user, but only for the transactions in February 2020.

If I do something like the following, then I believe only transactions in February are evaluated in the window function:

SELECT 
    transaction_id,
    SUM(price) OVER (PARTITION BY customer_id ORDER BY day) AS total
FROM 
    sales
WHERE 
    day BETWEEN DATE '2021-02-01' AND DATE '2021-02-28'

How can I compute this window function only for rows in February, but include all previous rows within the window function calculation?

This is just sample data, but in the actual data, there are a huge amount of sales, so there is a lot of redundant calculations of the window function for all rows outside of the specified timeframe.

EDIT:

I think my question was misunderstood as both answers so far reflect the opposite of what I want.

I want to return only trips in February, but the window function should make its calculation on ALL trips in January and February.

The result should look like this:

transaction_id,total
5,90
6,120
7,160
8,200

Upvotes: 0

Views: 536

Answers (3)

Naell
Naell

Reputation: 1

I think I got your question. I used T-SQL and created temp table and subquery

declare @from as date ='2020-02-01'
declare @to as date='2020-02-28'
select * from sales
select price, 
    (SELECT SUM(b.price) 
    FROM sales b 
    WHERE b.customer_id = a.customer_id and b.transaction_id <= a.transaction_id) as runningTotal,
    transaction_id, 
    customer_id, 
    day 
INTO #Temp from sales a

select * from #Temp where [day] between @from and @to order by customer_id, transaction_id
drop table #temp

enter image description here

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1271231

I want a query that shows the sum of all transactions to date per user, but only for the transactions in February 2020.

If I understand correctly, you only want to show the transactions in February but you want the sum to be all transactions. If so, then use a subquery:

select s.*
from (select s.*,
             sum(price) over (partition by customer_id order by day) as running_price
      from sales s
     ) s
where day >= date '2021-02-01' and date < date '2021-03-01';

Note the date comparison using the first of the months and inequalities. This will work for both 2020 (a leap year) and 2021 (not a leap year) -- and ironically both years are mentioned in your question.

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522817

Use a conditional summation:

SELECT 
    transaction_id,
    SUM(CASE WHEN day BETWEEN DATE '2021-02-01' AND DATE '2021-02-28'
             THEN price ELSE 0 END) OVER (PARTITION BY customer_id ORDER BY day) AS total
FROM sales

Upvotes: 0

Related Questions