Reputation: 53
I have a table (let's call it DiodeSales) that tells me the total number of diode sales I made, grouped by date, diode color, and country. This is a sample of this schema:
Date Color Country Sales June, 20 2016 00:00:00 Green US 1 June, 20 2016 00:00:00 Red Japan 1 June, 20 2016 00:00:00 Red US 1 June, 21 2016 00:00:00 Red US 1 June, 22 2016 00:00:00 Green US 1 June, 22 2016 00:00:00 Red US 1 June, 23 2016 00:00:00 Green US 1 June, 23 2016 00:00:00 Red Japan 1 June, 23 2016 00:00:00 Red US 1 June, 24 2016 00:00:00 Green US 1 June, 24 2016 00:00:00 Red US 1
I want to be able to have have an additional column that tells me how many diodes we've sold up until that point. So, for example, using the above data, the {June 23, Red, 1, US} row would have a total sales value of 4, because we've sold 4 red diodes in the US at that point.
I initially thought a cumulative sum would do the trick. So I wrote this: (sqlfiddle here)
SELECT
t1.Date,
t1.Color,
t1.Country,
t1.Sales,
SUM(t2.Sales) AS CumulativeSales
FROM DiodeSales AS t1
INNER JOIN DiodeSales AS t2
ON t1.Date >= t2.Date
AND t1.Color = t2.Color
AND t1.Country = t2.Country
GROUP BY
t1.Date,
t1.Color,
t1.Country
This gives me the cumulative sum, as expected, but it does not give me the total sales for a given color in a given country on a given day. In particular, because some specific days may have 0 sales in some country, they will not have a cumulative value associated to it. For example, consider the results of the previous table:
Date Color Country Sales CumulativeSales June, 20 2016 00:00:00 Green US 1 1 June, 20 2016 00:00:00 Red Japan 1 1 June, 20 2016 00:00:00 Red US 1 1 June, 21 2016 00:00:00 Red US 1 2 June, 22 2016 00:00:00 Green US 1 2 June, 22 2016 00:00:00 Red US 1 3 June, 23 2016 00:00:00 Green US 1 3 June, 23 2016 00:00:00 Red Japan 1 2 June, 23 2016 00:00:00 Red US 1 4 June, 24 2016 00:00:00 Green US 1 4 June, 24 2016 00:00:00 Red US 1 5
If I were to look for the column corresponding to Japan on June 24, I'd find nothing (because there was no Japan sale that day, so there is no Japan row for that day). I don't think there's a way to do this in SQL, but is it possible to populate this resulting table with values on days in which some countries had no sales? The starting table will always have at least one column for each day for some country.
I am aware I could just write a simple
SELECT SUM(Sales) FROM DiodeSales WHERE Date <= @someDate AND Color = @someColor AND Country = @someCountry
to get this information, but this is for a table that has to be formatted in that way for it to be used by another piece of already-made software.
EDIT: Someone mentioned this as a potential duplicate of Calculate a Running Total in SQL Server, but that post only addresses efficiency while calculating a running sum. I already have various ways of calculating this sum, but I'm looking for a way to fix the issue of missing day/country combinations for days when there were no sales in that country. For the above example, the fixed query would return this:
Date Color Country Sales CumulativeSales June, 20 2016 00:00:00 Green US 1 1 June, 20 2016 00:00:00 Red Japan 1 1 June, 20 2016 00:00:00 Red US 1 1 June, 21 2016 00:00:00 Green US 0 1 June, 21 2016 00:00:00 Red Japan 0 1 June, 21 2016 00:00:00 Red US 1 2 June, 22 2016 00:00:00 Green US 1 2 June, 22 2016 00:00:00 Red Japan 0 1 June, 22 2016 00:00:00 Red US 1 3 June, 23 2016 00:00:00 Green US 1 3 June, 23 2016 00:00:00 Red Japan 1 2 June, 23 2016 00:00:00 Red US 1 4 June, 24 2016 00:00:00 Green US 1 4 June, 24 2016 00:00:00 Red Japan 0 2 June, 24 2016 00:00:00 Red US 1 5
Upvotes: 2
Views: 1759
Reputation: 455
Try this
Select distinct Date into SalesDate From DiodeSales
SELECT S.Date,t.Color,t.Country,t.CumulativeSales
from DiodeSales t left join
(SELECt t1.Date,t1.Color,t1.Country,t1.Sales,
SUM(t2.Sales) AS CumulativeSales FROM DiodeSales AS t1
GROUP BY
t1.Date,
t1.Color,
t1.Country) t2 on
S.Date=t2.date
and t.Color=t2.color
and t.Country=t2.country
join
SalesDate S
on t.date=S.date
Upvotes: 0
Reputation: 8591
Try this:
SELECT [Date], Color, Country, Sales,
SUM(Sales) OVER(PARTITION BY Color, Country ORDER BY [Date] rows unbounded preceding) as RunningTotal
FROM YourTable
ORDER BY [Date], Color
It produces the output as expected.
[EDIT]
If you're looking for solution for missing dates, countries and colors, try this (replace @tmp
with the name of your table):
SELECT A.[Date], A.Color, A.Country, COALESCE(B.Sales, 0) AS Sales
, SUM(COALESCE(B.Sales, 0)) OVER(PARTITION BY A.Color, A.Country ORDER BY A.[Date] rows unbounded preceding) as RunningTotal
FROM (
SELECT [Date], Color, Country
FROM (SELECT DISTINCT [Date] FROM @tmp) AS q1 CROSS JOIN
(SELECT DISTINCT Color FROM @tmp) AS q2 CROSS JOIN
(SELECT DISTINCT Country FROM @tmp) AS q3
) AS A
LEFT JOIN @tmp AS B ON A.[Date] = B.[Date] AND A.Color= B.Color AND A.Country = B.Country
ORDER BY A.[Date], A.Color
Above query produces:
Date Color Country Sales RunningTotal
2016-06-20 Green Japan 0 0
2016-06-20 Green US 1 1
2016-06-20 Red Japan 1 1
2016-06-20 Red US 1 1
2016-06-21 Green US 0 1
2016-06-21 Green Japan 0 0
2016-06-21 Red US 1 2
2016-06-21 Red Japan 0 1
2016-06-22 Green Japan 0 0
2016-06-22 Green US 1 2
2016-06-22 Red Japan 0 1
2016-06-22 Red US 1 3
2016-06-23 Green US 1 3
2016-06-23 Green Japan 0 0
2016-06-23 Red US 1 4
2016-06-23 Red Japan 1 2
2016-06-24 Green Japan 0 0
2016-06-24 Green US 1 4
2016-06-24 Red Japan 0 2
2016-06-24 Red US 1 5
Upvotes: 3
Reputation: 455
I think you should use left join instead of inner join
SELECT
t.Date,
t.Color,
t.Country,
t.CumulativeSales
from DiodeSales t
left join
(SELECT
t1.Date,
t1.Color,
t1.Country,
t1.Sales,
SUM(t2.Sales) AS CumulativeSales
FROM DiodeSales AS t1
GROUP BY
t1.Date,
t1.Color,
t1.Country) t2
on
t.Date=t2.date
and t.Color=t2.color
and t.Country=t2.country
Upvotes: 2