Mike Kshymensky
Mike Kshymensky

Reputation: 111

Joining Onto CTE Performance

I have a stored procedure where I use a Common Table Expression to build a hierarchical path up a menu (so it can display something like Parent Menu -> Sub Menu -> Sub Sub Menu -> ...)

It works great for what I want to use it for, the issue comes when putting the information I get from the recursive CTE into the information I really want. I do an Inner Join from my Data to the CTE and get out the Hierarchical Path. For something that returns ~300 rows, the stored procedure takes on average 15-20 seconds.

When I insert the results from the CTE into a Temp Table and do the join based on that, the procedure takes less than a second.

I was just wondering why it takes so long to join using only the CTE, or if I am misusing CTE's in some way.

**Edit this is the stored procedure essentially

With Hierarchical_Path (Menu_ID, Parent_ID, Path) 

As
(
Select
    EM.Menu_Id, Parent_ID, 
            Convert(varchar(max), 
            EM.Description) as Path
From
    Menu EM
Where
--EM.Topic_No is null
    EM.Parent_ID = 0 and EM.Disabled = 0
Union All
Select  
    EM.Menu_ID,  
            EM.Parent_ID, 
            Convert(Varchar(max),E.Path + ' -> ' + EM.Description) as Path
From
    Menu EM
Inner Join
    Hierarchical_Path E
On
    EM.Parent_ID = E.Menu_ID    
)

SELECT distinct   
    EM.Description
    ,EMS.Path
FROM
    dbo.Menu em
INNER JOIN
    Hierarchical_Path EMS
ON
    EMS.Menu_ID = em.Menu_Id
    2 more INNER JOINs
    2 Left Joins
    WHERE Clause

When I run the query like this (joining onto the CTE) the performance is around 20 seconds.

When I insert the CTE results into a temp table, and join onto that, the performance is instantaneous.

Taking apart my query a bit more, it seems like it gets hung up on the where clause. I guess my question is more to the point of when exactly does a CTE run and does it get stored in memory? I was running under the assumption that it gets called once and then sticks around in memory, but under some circumstances could it be called mulitple times?

Upvotes: 3

Views: 2122

Answers (1)

Zhenny
Zhenny

Reputation: 831

The difference is a CTE is not persisted and a temporary table is (at least for the session). Joining on a non-persisted column means SQL has no stats on the data at all compared to the same column in a temporary table which is already pre-evaluated. Basically, the temp table caches what you would use and SQL Server can better optimize for it. The same issues are run into when joining on the result of a function or a table variable.

My guess is that your CTE execution plan is doing the execution with a single thread while your temp table can use multiple threads. You can check this by including actual execution plan when you run the queries and looking for two horizontal arrows pointing in opposite directions on each operator. That indicates parallelism.

P.S. - Try setting "set statistics io on" and "set statistics time on" to see if the actual cost of running the queries are the same regardless of run duration.

Upvotes: 2

Related Questions