Reputation: 33
I have this table
UserID | Date | Sale |
---|---|---|
A | 2021-05-01 | 3 |
A | 2021-05-03 | 1 |
A | 2021-05-03 | 2 |
A | 2021-05-05 | 5 |
B | 2021-05-02 | 4 |
B | 2021-05-03 | 10 |
What I need is something that looks like this.
UserID | Date | DailySale | RunningSale |
---|---|---|---|
A | 2021-05-01 | 3 | 3 |
A | 2021-05-02 | NULL | 3 |
A | 2021-05-03 | 3 | 6 |
A | 2021-05-04 | NULL | 6 |
A | 2021-05-05 | 5 | 11 |
B | 2021-05-01 | NULL | 0 |
B | 2021-05-02 | 4 | 4 |
B | 2021-05-03 | 10 | 14 |
B | 2021-05-04 | NULL | 14 |
B | 2021-05-05 | NULL | 14 |
I need to join on itself with all the dates in a certain time period so I can create a running sum sales total by date. I figured out how to do it all separately, I know how to do a running sum using (over partition by) and I know I can join a calendar table to my sales table to get the time period. But I want to try the self join method by distinct(datetime), and I'm not certain how to go about that. I've tried this, but it doesn't work for me. I have over 1 million rows, so it takes over 2 minutes to finished processing and the running-sum column looks exactly like the daily-sum column.
What's the best way to go about this?
Edit: Corrected Table Sums
Upvotes: 2
Views: 1300
Reputation: 522161
You need a calendar table here containing all dates. Consider the following approach:
WITH dates AS (
SELECT '2021-05-01' AS Date UNION ALL
SELECT '2021-05-02' UNION ALL
SELECT '2021-05-03' UNION ALL
SELECT '2021-05-04' UNION ALL
SELECT '2021-05-05'
)
SELECT
u.UserID,
d.Date,
SUM(t.Sale) AS DailySale,
SUM(COALESCE(SUM(t.Sale), 0)) OVER (PARTITION BY u.UserID ORDER BY d.Date) AS RunningSale
FROM (SELECT DISTINCT UserID FROM yourTable) u
CROSS JOIN dates d
LEFT JOIN yourTable t
ON t.UserID = u.UserID AND t.Date = d.Date
GROUP BY
u.UserID,
d.Date
ORDER BY
u.UserID,
d.Date
Upvotes: 3