StatsViaCsh
StatsViaCsh

Reputation: 2640

Temp table index/ performance help requested

This is a continuation of sorts from my prior post: RowNumber() and Partition By performance help wanted

I have a query that needs significant performance improvement. With the suggestions in the prior post, I eliminated all but one cte in my query, and implemented a temp table, with an index on it. It's still very slow... currently 40 min and counting, with no data returned yet. Some background info: the one table all the data comes from has about 5 million rows. There are a couple of indexes on it, including a unique nonclustered, which consists of columns Symbol, Period, and TradeDate, with the Value column included. I have two others exactly the same but with Period first, then TradeDate first. There is a unique clustered index on the table as well. What can speed this up? A different index on the temp table? Sorry for a semi- duplicate post.. I'm at a standstill here. Any help would be huge.

create table ##smaComp
(
    RowNum          bigint,
    Rank            bigint,
    TradeDate       Date,
    Symbol          Char(6),
    FastPer         int,
    FastVal         Decimal(9,4),
    SlowPer         int,
    SlowVal         Decimal(9,4),
    FastMinusSlow   Decimal(9,4)
)

;with sma as
(
    select t.TradeDate, t.Symbol, t.Period FastPer, t.Value FastVal, t2.Period SlowPer, 
        t2.Value SlowVal, (t.Value-t2.Value) FastMinusSlow
    from tblDailySMA t join tblDailySMA as t2 on t.Symbol = t2.Symbol 
        and t.TradeDate = t2.TradeDate and t2.Period > t.Period
)

insert into ##smaComp
(
    RowNum, Rank, TradeDate, Symbol, FastPer, FastVal, SlowPer, SlowVal, FastMinusSlow
)
select ROW_NUMBER() OVER (PARTITION BY sma.Symbol, sma.FastPer, sma.SlowPer
    ORDER BY sma.TradeDate) as RowNum, DENSE_RANK() OVER (ORDER BY sma.Symbol, sma.FastPer,
    sma.SlowPer) as Rank, sma.TradeDate, sma.Symbol, sma.FastPer, sma.FastVal, 
    sma.SlowPer, sma.SlowVal, sma.FastMinusSlow
from sma

CREATE UNIQUE NONCLUSTERED INDEX [IX_tblDailySMAClustered] ON ##smaComp
(RowNum, Rank)
INCLUDE (Symbol, TradeDate, FastPer, SlowPer, FastVal, SlowVal, FastMinusSlow)

select t.TradeDate as PriorDate, t.FastPer, t.FastVal, t.SlowPer, t.SlowVal,
    t.FastMinusSlow, t2.TradeDate as LatestDate, t2.FastPer, t2.FastVal, t2.SlowPer, 
    t2.SlowVal, t2.FastMinusSlow, (t2.FastMinusSlow * t2.FastMinusSlow) as Comparison
from ##smaComp t join ##smaComp t2
on t.Rank = t2.Rank and t.RowNum = (t2.RowNum - 1)

The execution plan(s), as requested:

StmtText
  create table ##smaComps  (   RowNum   bigint,   Rank   bigint,   TradeDate  Date,   Symbol   Char(6),   FastPer   int,   FastVal   Decimal(9,4),   SlowPer   int,   SlowVal   Decimal(9,4),   FastMinusSlow Decimal(9,4)  )
;with sma as  (   select t.TradeDate, t.Symbol, t.Period FastPer, t.Value FastVal, t2.Period SlowPer,     t2.Value SlowVal, (t.Value-t2.Value) FastMinusSlow   from tblDailySMA t join tblDailySMA as t2 on t.Symbol = t2.Symbol     and t.TradeDate = t2.TradeDate and t2.Period > t.Period  )    insert into ##smaComps  (   RowNum, Rank, TradeDate, Symbol, FastPer, FastVal, SlowPer, SlowVal, FastMinusSlow  )  select ROW_NUMBER() OVER (PARTITION BY sma.Symbol, sma.FastPer, sma.SlowPer   ORDER BY sma.TradeDate) as RowNum, DENSE_RANK() OVER (ORDER BY sma.Symbol, sma.FastPer,   sma.SlowPer) as Rank, sma.TradeDate, sma.Symbol, sma.FastPer, sma.FastVal,    sma.SlowPer, sma.SlowVal, sma.FastMinusSlow  from sma

StmtText
  |--Table Insert(OBJECT:([tempdb].[dbo].[##smaComps]), SET:([tempdb].[dbo].[##smaComps].[RowNum] = [Expr1009],[tempdb].[dbo].[##smaComps].[Rank] = [Expr1010],[tempdb].[dbo].[##smaComps].[TradeDate] = [Market].[dbo].[tblDailySMA].[TradeDate] as [t].[TradeDate],[tempdb].[dbo].[##smaComps].[Symbol] = [Expr1011],[tempdb].[dbo].[##smaComps].[FastPer] = [Market].[dbo].[tblDailySMA].[Period] as [t].[Period],[tempdb].[dbo].[##smaComps].[FastVal] = [Expr1012],[tempdb].[dbo].[##smaComps].[SlowPer] = [Market].[dbo].[tblDailySMA].[Period] as [t2].[Period],[tempdb].[dbo].[##smaComps].[SlowVal] = [Expr1013],[tempdb].[dbo].[##smaComps].[FastMinusSlow] = [Expr1014]))
       |--Compute Scalar(DEFINE:([Expr1014]=CONVERT_IMPLICIT(decimal(9,4),[Expr1008],0)))
            |--Top(ROWCOUNT est 0)
                 |--Compute Scalar(DEFINE:([Expr1011]=CONVERT_IMPLICIT(char(6),[Market].[dbo].[tblDailySMA].[Symbol] as [t].[Symbol],0), [Expr1012]=CONVERT_IMPLICIT(decimal(9,4),[Market].[dbo].[tblDailySMA].[Value] as [t].[Value],0), [Expr1013]=CONVERT_IMPLICIT(decimal(9,4),[Market].[dbo].[tblDailySMA].[Value] as [t2].[Value],0)))
                      |--Sequence Project(DEFINE:([Expr1010]=dense_rank))
                           |--Segment
                                |--Segment
                                     |--Sequence Project(DEFINE:([Expr1009]=row_number))
                                          |--Segment
                                               |--Compute Scalar(DEFINE:([Expr1008]=[Market].[dbo].[tblDailySMA].[Value] as [t].[Value]-[Market].[dbo].[tblDailySMA].[Value] as [t2].[Value]))
                                                    |--Parallelism(Gather Streams, ORDER BY:([t].[Symbol] ASC, [t].[Period] ASC, [t2].[Period] ASC, [t].[TradeDate] ASC))
                                                         |--Sort(ORDER BY:([t].[Symbol] ASC, [t].[Period] ASC, [t2].[Period] ASC, [t].[TradeDate] ASC))
                                                              |--Merge Join(Inner Join, MANY-TO-MANY MERGE:([t].[TradeDate], [t].[Symbol])=([t2].[TradeDate], [t2].[Symbol]), RESIDUAL:([Market].[dbo].[tblDailySMA].[Symbol] as [t].[Symbol]=[Market].[dbo].[tblDailySMA].[Symbol] as [t2].[Symbol] AND [Market].[dbo].[tblDailySMA].[TradeDate] as [t].[TradeDate]=[Market].[dbo].[tblDailySMA].[TradeDate] as [t2].[TradeDate] AND [Market].[dbo].[tblDailySMA].[Period] as [t2].[Period]>[Market].[dbo].[tblDailySMA].[Period] as [t].[Period]))
                                                                   |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([t].[TradeDate], [t].[Symbol]), ORDER BY:([t].[TradeDate] ASC, [t].[Symbol] ASC))
                                                                   |    |--Index Scan(OBJECT:([Market].[dbo].[tblDailySMA].[IX_tblDailySMA_TrDateNonClust] AS [t]), ORDERED FORWARD)
                                                                   |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([t2].[TradeDate], [t2].[Symbol]), ORDER BY:([t2].[TradeDate] ASC, [t2].[Symbol] ASC))
                                                                        |--Index Scan(OBJECT:([Market].[dbo].[tblDailySMA].[IX_tblDailySMA_TrDateNonClust] AS [t2]), ORDERED FORWARD)

StmtText
 CREATE UNIQUE NONCLUSTERED INDEX [IX_tblDailySMAClustered] ON ##smaComps  (RowNum, Rank)  INCLUDE (Symbol, TradeDate, FastPer, SlowPer, FastVal, SlowVal, FastMinusSlow)
 select t.TradeDate as PriorDate, t.FastPer, t.FastVal, t.SlowPer, t.SlowVal,   t.FastMinusSlow, t2.TradeDate as LatestDate, t2.FastPer, t2.FastVal, t2.SlowPer,    t2.SlowVal, t2.FastMinusSlow, (t2.FastMinusSlow * t2.FastMinusSlow) as Comparison  from ##smaComps t join ##smaComps t2  on t.Rank = t2.Rank and t.RowNum = (t2.RowNum - 1)

StmtText
  |--Hash Match(Inner Join, HASH:([t].[Rank], [t].[RowNum])=([t2].[Rank], [Expr1007]), RESIDUAL:([tempdb].[dbo].[##smaComps].[Rank] as [t].[Rank]=[tempdb].[dbo].[##smaComps].[Rank] as [t2].[Rank] AND [tempdb].[dbo].[##smaComps].[RowNum] as [t].[RowNum]=[Expr1007]))
       |--Table Scan(OBJECT:([tempdb].[dbo].[##smaComps] AS [t]))
       |--Compute Scalar(DEFINE:([Expr1006]=[tempdb].[dbo].[##smaComps].[FastMinusSlow] as [t2].[FastMinusSlow]*[tempdb].[dbo].[##smaComps].[FastMinusSlow] as [t2].[FastMinusSlow], [Expr1007]=[tempdb].[dbo].[##smaComps].[RowNum] as [t2].[RowNum]-(1)))
            |--Table Scan(OBJECT:([tempdb].[dbo].[##smaComps] AS [t2]))

Upvotes: 1

Views: 276

Answers (1)

usr
usr

Reputation: 171246

The query plan does not contain anything that can go particularly wrong (like nested loop explosion).

I think the reason is that you are cross-joining over all "periods". My guess is that your data contains many (100s?) periods per TradeDate and Symbol. This means that SQL Server has to process a quadratic amount of data. The predicate t2.Period > t.Period filters about half of the rows, but the other half remains.

So the data volume is extremely high by principle. Not sure if this can be optimized. Do you need all of the data or just a subset? If you need all of it I don't think anything can be done.

You can test this hypothesis by restricting the query to one TradeDate and Symbol and look at the row counts.

Upvotes: 1

Related Questions