Reputation: 19
I have inherited a stored procedure that utilizes a table variable to store data, then updates each row with a running total calculation. The order of the records in the table variable is very important, as we want the volume to be ordered highest to lowest (i.e. the running total will get increasingly larger as you go down the table).
My problem is, during the step where the table variable is updated, the running total seems to be calculating , but not in a way that the data in the table variable was previously sorted by (descending by highest volume)
DECLARE @TableVariable TABLE ([ID], [Volume], [SortValue], [RunningTotal])
--Populate table variable and order by the sort value...
INSERT INTO @TableVariable (ID, Volume, SortValue)
SELECT
[ID], [Volume], ABS([Volume]) as SortValue
FROM
dbo.VolumeTable
ORDER BY
SortValue DESC
--Set TotalVolume variable...
SELECT@TotalVolume = ABS(sum([Volume]))
FROM @TableVariable
--Calculate running total, update rows in table variable...I believe this is where problem occurs?
SET @RunningTotal = 0
UPDATE @TableVariable
SET @RunningTotal = RunningTotal = @RunningTotal + [Volume]
FROM @TableVariable
--Output...
SELECT
ID, Volume, SortValue, RunningTotal
FROM
@TableVariable
ORDER BY
SortValue DESC
The result is, the record that had the highest volume, that I would have expected the running total to calculate on first (thus running total = [volume]), somehow ends up much further down in the list. The running total seems to calculate randomly
Here is what I would expect to get:
But here is what the code actually generates:
Not sure if there is a way to get the UPDATE statement to be enacted on the table variable in such a way that it is ordered by volume desc? From what Ive read so far, it could be an issue with the sorting behavior of a table variable but not sure how to correct? Can anyone help?
Upvotes: 1
Views: 2674
Reputation: 81970
Take a peek at the Window functions offered in SQL
For example
Declare @YourTable table (ID int,Volume int)
Insert Into @YourTable values
(100,1306489),
(125,898426),
(150,907404)
Select ID
,Volume
,RunningTotal = sum(Volume) over (Order by Volume Desc)
From @YourTable
Order By Volume Desc
Returns
ID Volume RunningTotal
100 1306489 1306489
150 907404 2213893
125 898426 3112319
To be clear, The @YourTable is for demonstrative purposes only. There should be no need to INSERT your actual data into a table variable.
EDIT to Support 2008 (Good news is Row_Number() is supported in 2008)
Select ID
,Volume
,RowNr=Row_Number() over (Order by Volume Desc)
Into #Temp
From @YourTable
Select A.ID
,A.Volume
,RunningTotal = sum(B.Volume)
From #Temp A
Join #Temp B on (B.RowNr<=A.RowNr)
Group By A.ID,A.Volume
Order By A.Volume Desc
Upvotes: 2
Reputation: 131473
GarethD provided the definitive link to the multiple ways of calculating running totals and their performance. The correct one is both the simplest and fastest, 300 times faster that then quirky update. That's because it can take advantage of any indexes that cover the sort column, and because it's a lot simpler.
I repeat it here to make clear how much simpler this is when the database provided the appropriate windowing functions
SELECT
[Date],
TicketCount,
SUM(TicketCount) OVER (ORDER BY [Date] RANGE UNBOUNDED PRECEDING)
FROM dbo.SpeedingTickets
ORDER BY [Date];
The SUM
line means: Sum all ticket counts over all (UNBOUNDED) the rows that came before (PRECEDING) the current one if they were ordered by date
That ends up being 300 times faster than the quirky update.
The equivalent query for VolumeTable
would be:
SELECT
ID,
Volume,
ABS(Volume) as SortValue,
SUM(Volume) OVER (ORDER BY ABS(Volume) DESC RANGE UNBOUNDED PRECEDING)
FROM
VolumeTable
ORDER BY ABS(Volume) DESC
Note that this will be a lot faster if there is an index on the sort column (Volume), and ABS isn't used. Applying any function on a column means that the optimizer can't use any indexes that cover it, because the actual sort value is different than the one stored in the index.
If the table is very large and performance suffers, you could create a computed column and create an index on it
Upvotes: 2