Spartan46236
Spartan46236

Reputation: 19

Using a running total calculated column in SQL Server table variable

I have inherited a stored procedure that utilizes a table variable to store data, then updates each row with a running total calculation. The order of the records in the table variable is very important, as we want the volume to be ordered highest to lowest (i.e. the running total will get increasingly larger as you go down the table).

My problem is, during the step where the table variable is updated, the running total seems to be calculating , but not in a way that the data in the table variable was previously sorted by (descending by highest volume)

DECLARE @TableVariable TABLE ([ID], [Volume], [SortValue], [RunningTotal])

--Populate table variable and order by the sort value...
INSERT INTO @TableVariable (ID, Volume, SortValue)
   SELECT   
       [ID], [Volume], ABS([Volume]) as SortValue
   FROM  
       dbo.VolumeTable  
   ORDER BY 
       SortValue DESC

--Set TotalVolume variable...
SELECT@TotalVolume = ABS(sum([Volume])) 
FROM @TableVariable

--Calculate running total, update rows in table variable...I believe this is where problem occurs?
SET @RunningTotal = 0

UPDATE @TableVariable
SET @RunningTotal = RunningTotal = @RunningTotal + [Volume]
FROM @TableVariable

--Output...
SELECT  
    ID, Volume, SortValue, RunningTotal 
FROM    
    @TableVariable      
ORDER BY 
    SortValue DESC

The result is, the record that had the highest volume, that I would have expected the running total to calculate on first (thus running total = [volume]), somehow ends up much further down in the list. The running total seems to calculate randomly

Here is what I would expect to get:

Good result

But here is what the code actually generates:

enter image description here

Not sure if there is a way to get the UPDATE statement to be enacted on the table variable in such a way that it is ordered by volume desc? From what Ive read so far, it could be an issue with the sorting behavior of a table variable but not sure how to correct? Can anyone help?

Upvotes: 1

Views: 2674

Answers (2)

John Cappelletti
John Cappelletti

Reputation: 81970

Take a peek at the Window functions offered in SQL

For example

Declare @YourTable table (ID int,Volume int)
Insert Into @YourTable values
(100,1306489),
(125,898426),
(150,907404)

Select ID
      ,Volume
      ,RunningTotal = sum(Volume) over (Order by Volume Desc)
 From  @YourTable 
 Order By Volume Desc

Returns

ID  Volume  RunningTotal
100 1306489 1306489
150 907404  2213893
125 898426  3112319

To be clear, The @YourTable is for demonstrative purposes only. There should be no need to INSERT your actual data into a table variable.

EDIT to Support 2008 (Good news is Row_Number() is supported in 2008)

 Select ID
        ,Volume
        ,RowNr=Row_Number() over (Order by Volume Desc)
  Into  #Temp
  From  @YourTable 


 Select A.ID
       ,A.Volume
       ,RunningTotal = sum(B.Volume)
  From  #Temp A 
  Join  #Temp B on (B.RowNr<=A.RowNr)
  Group By A.ID,A.Volume
  Order By A.Volume Desc

Upvotes: 2

Panagiotis Kanavos
Panagiotis Kanavos

Reputation: 131473

GarethD provided the definitive link to the multiple ways of calculating running totals and their performance. The correct one is both the simplest and fastest, 300 times faster that then quirky update. That's because it can take advantage of any indexes that cover the sort column, and because it's a lot simpler.

I repeat it here to make clear how much simpler this is when the database provided the appropriate windowing functions

SELECT
    [Date],
    TicketCount,
    SUM(TicketCount) OVER (ORDER BY [Date] RANGE UNBOUNDED PRECEDING)
FROM dbo.SpeedingTickets
ORDER BY [Date];

The SUM line means: Sum all ticket counts over all (UNBOUNDED) the rows that came before (PRECEDING) the current one if they were ordered by date

That ends up being 300 times faster than the quirky update.

The equivalent query for VolumeTable would be:

SELECT   
   ID, 
   Volume, 
   ABS(Volume) as SortValue,
   SUM(Volume) OVER (ORDER BY ABS(Volume) DESC RANGE UNBOUNDED PRECEDING)
FROM  
   VolumeTable
ORDER BY ABS(Volume) DESC

Note that this will be a lot faster if there is an index on the sort column (Volume), and ABS isn't used. Applying any function on a column means that the optimizer can't use any indexes that cover it, because the actual sort value is different than the one stored in the index.

If the table is very large and performance suffers, you could create a computed column and create an index on it

Upvotes: 2

Related Questions