Reputation: 8693
Is there any way to improve the performance of the following CTE query (@E
and @R
are tables with indexes in the actual system):
DECLARE @id bigint = 1
DECLARE @E TABLE
(
id bigint,
name varchar(50)
)
DECLARE @R TABLE
(
child_id bigint,
parent_id bigint
)
INSERT INTO @E SELECT 1, 'one'
INSERT INTO @E SELECT 2, 'two'
INSERT INTO @E SELECT 3, 'three'
INSERT INTO @E SELECT 4, 'four'
INSERT INTO @E SELECT 5, 'five'
INSERT INTO @E SELECT 6, 'six'
INSERT INTO @E SELECT 7, 'seven'
INSERT INTO @R SELECT 1, 2
INSERT INTO @R SELECT 1, 3
INSERT INTO @R SELECT 3, 4
INSERT INTO @R SELECT 5, 4
INSERT INTO @R SELECT 3, 6
INSERT INTO @R SELECT 7, 4
; WITH cte
(
child_id,
parent_id
)
AS (
SELECT * FROM @R R
WHERE R.child_id = @id
UNION ALL
SELECT R.* FROM @R R
INNER JOIN cte ON CTE.parent_id = R.child_id
)
SELECT * FROM @E E
WHERE e.id = @id
UNION ALL
SELECT P.* FROM @E E
INNER JOIN cte ON 1=1
INNER JOIN @E P ON P.id = cte.parent_id
WHERE e.id = @id
ORDER BY 1
Expected Results:
id | name
1 | one
2 | two
3 | three
4 | four
6 | six
In real world data I will be dealing with many millions of rows in @R and about a hundred thousand rows in in @E. So I'm looking to see if there's anything I can do to squeeze a little more performance out.
Edit: just to clarify and summarize so far, there is a clustered pk index on R with child_id, parent_id
and adding an index to @r.parent_id
will also improve join performance.
Is there anything improve this? The bit after the CTE with the inner join 1=1
is there anything that could be improved upon here is is that about as good as it will get? Is there any other schema design I could do to get similar parent-child mapping with better performance?
Upvotes: 4
Views: 8920
Reputation: 8693
As marc_s pointed out
a clustered index on [child_id, parent_id]
for this table is not good enough - you should have separate indices on both (child_id)
and (parent_id)
to speed up JOIN performance. If you have a compound index on (child_id, parent_id)
in that order, then this can be used for child_id
alone - but not for parent_id
alone (and it will need both) – marc_s Dec 19 '11 at 12:35
This greatly improved the performance of the query and helped me understand how CTE queries work internally.
Upvotes: 3
Reputation: 21766
For the very beginning:
DECLARE @E TABLE
(
id BIGINT PRIMARY KEY,
name varchar(50)
)
DECLARE @R TABLE
(
child_id bigint,
parent_id BIGINT,
PRIMARY KEY(child_id, parent_id),
UNIQUE (parent_id, child_id)
)
But, keep in mind that Sql Server is very poor in optimizing CTEs.
Upvotes: 0