Reputation: 651
I create temporal table in order to set level :
CREATE TABLE [#DesignLvl]
(
[DesignKey] INT,
[DesignLevel] INT
);
WITH RCTE AS
(
SELECT
*,
1 AS [Lvl]
FROM
[Design]
WHERE
[ParentDesignKey] IS NULL
UNION ALL
SELECT
[D].*,
[Lvl] + 1 AS [Lvl]
FROM
[dbo].[Design] AS [D]
INNER JOIN
[RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO [#DesignLvl]
SELECT
[DesignKey], [Lvl]
FROM
[RCTE]
Once created, I used as LEFT JOIN in really big query as:
SELECT...
FROM..
LEFT JOIN [#DesignLvl] AS [dl] ON d.DesignKey = dl.DesignKey
WHERE ...
The query works, but performance has fallen and the query now is too slow. Is there any way to optimize this table?
Execution plan of CTE
I try to add CLUSTERED index as:
CREATE TABLE [#DesignLvl]
(
[DesignKey] INT,
[DesignLevel] INT
);
CREATE CLUSTERED INDEX ix_DesignLvl
ON [#DesignLvl] ([DesignKey], [DesignLevel]);
Also try:
CREATE TABLE [#DesignLvl]
( [DesignKey] INT INDEX IX1 CLUSTERED ,
[DesignLevel] INT INDEX IX2 NONCLUSTERED );
But I get same result, it took long to execute
Upvotes: 7
Views: 1237
Reputation: 5094
Your question is incomplete, "query is slow", but which part of query is slow ?
CTEQuery
or LEFT JOIN in really big query
I think Script of big query is required, along with details, like which table contain how many rows, their data type etc.
Throw more details about the big query.
Also let us know if any UDF is involve in join condition.
Why do you left join
Temp table ? WHY NOT INNER JOIN
Test the performance separately or CTE and Big Query.
Once use [D].[ParentDesignKey] is not null
in recursive part,
SELECT
[D].*,
[Lvl] + 1 AS [Lvl]
FROM
[dbo].[Design] AS [D]
INNER JOIN
[RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
and [D].[ParentDesignKey] is not null
NOTE : In CTE use only those columns which is require.
If it is possible to Pre- Calculate [Lvl]
,because Recursive CTE
performance is specially bad involving lot of Records.
How many rows will be process in each CTE Query on average ?
If temp table will hold more than 100 rows
then yes create Clustered index on it,
CREATE CLUSTERED INDEX ix_DesignLvl
ON [#DesignLvl] ([DesignKey], [DesignLevel]);
If you are not using [DesignLevel]
in join condition then remove from index.
Also ,Reveal index of table [dbo].[Design]
and few data of DesignKey and ParentDesignKey.
There are several reason for getting Index Scan
, one of them is Selectivity of Key
.
So one DesignKey
can have how many rows and one ParentDesignKey
can have how many rows ?
So depending upon above answer Create Composite Clustered Index
on both key of table [dbo].[Design]
So consider my answer is incomplete, I will update it accordingly.
Upvotes: 3
Reputation: 677
Make sure there are no nulls in DesignKey.ParentDesignKey and #DesignLv1.DesignKey columns and if so, use is not null constraint where you can. i have seen nulls to create cross joins.
If Design table is a transactional table that is being written to very frequently, rebuild indexes on this table frequently.
Create one non clustered index on Design.DesignKey and Design.ParentDesignKey in that sequence.
Create a non clustered index on #DesignLvl DesignKey.
If Design table is large ( > 10 million rows) and a whole bunch of columns, create a indexed view of the distinct columns that you need only for this query and use that.
Check System event log for disk read write failures on disk that has tempdb and (You should put the tempdb on either a RAID 1 or RAID 10 array as they're optimized for high-write applications.) from ( https://searchsqlserver.techtarget.com/tip/SQL-Server-tempdb-best-practices-increase-performance )
Upvotes: 0
Reputation: 21
The problem could be with the Design table being huge and joining it with itself without any main filter conditions resulted in scanning of the entire table.
Because you interested in only very few columns like designkey and parentdesignkey, try splitting the data population query (insert into #designlvl) into multiple parts.
Make sure you have a index (designkey,parentdesignkey)
INSERT INTO #DesignLevel
SELECT DISTINCT DesignKey, 1 FROM Design WHERE ParentDesignKey IS NULL
INSERT INTO #DesignLevel
SELECT DISTINCT ParentDesignKey, Lvl+1 FROM Design WHERE ParentDesignKey is NOT NULL
Upvotes: 1
Reputation: 1376
Have you tried changing the SELECT *
to just SELECT DesignLevel
, I found that for wide rows this was enough to change the execution plan to choose to use an eager spool for the index scan:
WITH RCTE AS
(
SELECT
[DesignKey],
1 AS [Lvl]
FROM
[Design]
WHERE
[ParentDesignKey] IS NULL
UNION ALL
SELECT
[D].[DesignKey],
[Lvl] + 1 AS [Lvl]
FROM
[dbo].[Design] AS [D]
INNER JOIN
[RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO [#DesignLvl]
SELECT
[DesignKey], [Lvl]
FROM
[RCTE]
Plan and test SQL can be found here: https://www.brentozar.com/pastetheplan/?id=BymxTD4wV
Upvotes: 2
Reputation: 714
Have you tried memory optimized tables? I used them in a similar process (recursive CTE) and I had spectacular results. In SQL Server 2017 should be included in Standard Edition also. You first need to create a filegroup for memory-optimized data:
ALTER DATABASE MyDB
ADD FILEGROUP mem_data CONTAINS MEMORY_OPTIMIZED_DATA;
GO
ALTER DATABASE MyDB
ADD FILE (NAME = 'MemData', FILENAME = 'D:\Data\MyDB_MemData.ndf') TO FILEGROUP mem_data;
Then you create (or convert) your table:
CREATETABLE dbo.MemoryTable
(
Col1 INT IDENTITY PRIMARY KEY
...
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
Upvotes: 2
Reputation: 6002
As others have said, it's not entirely clear what part of your query is slow. Nor do we have any idea about the number of records (could be 100, could be 100 million) or the actual timings (you might consider 10 seconds for loading millions of rows slow?!?).
We also don't know how 'hard' your really big query
is; for all we know it could be slow without the LEFT OUTER JOIN
too.
Anyway, to get a better idea, what happens if you run this: (untested code, you might need to fix some things)
DECLARE @level int = 0,
@rowcount int
-- create working table to calculate levels
SELECT lvl = @level,
D.[DesignKey]
INTO #hierarchy
FROM [Design] D
WHERE D.[ParentDesignKey] IS NULL
SELECT @rowcount = @@ROWCOUNT
PRINT Convert(nvarchar, CURRENT_TIMESTAMP, 113) + ' - Loaded ' + Convert(nvarchar, @rowcount) + N' level ' + Convert(nvarchar, @rowcount) + ' records...'
CREATE UNIQUE CLUSTERED INDEX uq0 ON #hierarchy (lvl, [DesignKey])
WHILE @rowcount > 0
BEGIN
INSERT #hierarchy
SELECT lvl = @level + 1,
D.[DesignKey]
FROM #hierarchy t
JOIN [Design] D
ON D.[ParentDesignKey] = t.[DesignKey]
WHERE t.lvl = @level
SELECT @rowcount = @@ROWCOUNT,
@level = @level + 1
PRINT Convert(nvarchar, CURRENT_TIMESTAMP, 113) + ' - Loaded ' + Convert(nvarchar, @rowcount) + N' level ' + Convert(nvarchar, @rowcount) + ' records...'
END
GO
-- we now have a lvl value for each DesignKey but the index is backwards for future use; so add index in the other direction
PRINT Convert(nvarchar, CURRENT_TIMESTAMP, 113) + ' - re-indexing...'
CREATE UNIQUE INDEX uq1 ON #hiearchy ([DesignKey]) INCLUDE (lvl) WITH (FILLFACTOR = 100)
PRINT Convert(nvarchar, CURRENT_TIMESTAMP, 113) + ' - done.'
GO
PRINT Convert(nvarchar, CURRENT_TIMESTAMP, 113) + ' - Starting query...'
-- actual use:
;WITH DesignLvlCTE
AS (SELECT h.lvl, D.*
FROM [Design] D
JOIN #hierarchy h
ON h.[DesignKey] = D.[DesignKey])
SELECT...
INTO #result -- leave this in to exclude overhead time of client
FROM..
LEFT JOIN DesignLvlCTE AS [dl] ON d.DesignKey = dl.DesignKey
WHERE ...
PRINT Convert(nvarchar, CURRENT_TIMESTAMP, 113) + ' - Done fetching data.'
-- get results
SELECT * FROM #result
PRINT Convert(nvarchar, CURRENT_TIMESTAMP, 113) + ' - Done.'
-- DROP TABLE #result
Upvotes: 2
Reputation: 783
The performance could be slower because a clustered index on the dbo.Design
table is being accessed inside a nested loop. According to the cost estimate, the database is spending 66% of it's time scanning that index. Looping over that just makes it worse.
See related question
Consider changing the index on dbo.Design
to be non-clustered, or try creating another temporary table with a non-clustered index and use that for your recursive query:
CREATE TABLE [#DesignTemp]
(
ParentDesignKey INT,
DesignKey INT
);
-- Insert the data, then create the index.
INSERT INTO [#DesignTemp]
SELECT
ParentDesignKey,
DesignKey
FROM [dbo].[Design];
COMMIT;
-- Try this index, or create indexes for individual columns if the plan works better at high volumes.
CREATE NONCLUSTERED INDEX ix_DesignTemp1 ON [#DesignTemp] (ParentDesignKey, DesignKey);
CREATE TABLE [#DesignLvl]
(
[DesignKey] INT,
[DesignLevel] INT
);
WITH RCTE AS
(
SELECT
*,
1 AS [Lvl]
FROM
[DesignTemp]
WHERE
[ParentDesignKey] IS NULL
UNION ALL
SELECT
[D].*,
[Lvl] + 1 AS [Lvl]
FROM
[DesignTemp] AS [D]
INNER JOIN
[RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO [#DesignLvl]
SELECT
[DesignKey], [Lvl]
FROM
[RCTE];
Upvotes: 4
Reputation: 403
try @table, you query with a memory temp table instead of a disk temp table
declare @DesignLvl table
(
[DesignKey] INT,
[DesignLevel] INT
);
WITH RCTE AS
(
SELECT
*,
1 AS [Lvl]
FROM
[Design]
WHERE
[ParentDesignKey] IS NULL
UNION ALL
SELECT
[D].*,
[Lvl] + 1 AS [Lvl]
FROM
[dbo].[Design] AS [D]
INNER JOIN
[RCTE] AS [rc] ON [rc].[DesignKey] = [D].[ParentDesignKey]
)
INSERT INTO @DesignLvl
SELECT
[DesignKey], [Lvl]
FROM
[RCTE]
may help a bit, how many rows are we talking about and what sql server edition? @@version?
Upvotes: 2
Reputation: 3595
According to my testing published on this article, a set-based loop could give you a performance improvement over a recursive CTE.
DECLARE @DesignLevel int = 0;
INSERT INTO [#DesignLvl]
SELECT [DesignKey], 1
FROM [RCTE];
WHILE @@ROWCOUNT > 0
BEGIN
SET @DesignLevel += 1;
INSERT INTO [#DesignLvl]
SELECT [D].[DesignKey], dl.DesignLevel
FROM [dbo].[Design] AS [D]
JOIN [#DesignLvl] AS [dl] ON [dl].[DesignKey] = [D].[ParentDesignKey]
WHERE dl.DesignLevel = @DesignLevel;
END;
Upvotes: 2