Reputation: 5736
Hi I have a table that stores the mappings of my data, e.g.
Data Map
id | letter | type l | r
------------------ -----
1 | AA | HEAD 5 | 1
2 | BB | HEAD 2 | 1
3 | CC | HEAD 6 | 2
4 | DD | HEAD 3 | 2
5 | END-1 | END 7 | 3
6 | END-2 | END 8 | 4
7 | END-3 | END
8 | END-4 | END
See http://sqlfiddle.com/#!3/4eccfe/5
I want to find out all the END type links from a given source, e.g. for AA, I get END-1, END-2, END-3; for BB, I get END-1, END-2, END-3; for CC, I get END-1, END-2, END-3; for DD, I get END-4
I have written what I want using recursive CTE:
;WITH data(id, letter, type) AS (
SELECT '1', 'AA', 'HEAD' UNION SELECT '2', 'BB', 'HEAD' UNION SELECT '3', 'CC', 'HEAD' UNION
SELECT '4', 'DD', 'HEAD' UNION SELECT '5', 'END-1', 'END' UNION SELECT '6', 'END-2', 'END' UNION
SELECT '7', 'END-3', 'END' UNION SELECT '8', 'END-4', 'END'
), map (l, r) AS (
SELECT '5', '1' UNION SELECT '2', '1' UNION
SELECT '6', '2' UNION SELECT '3', '2' UNION
SELECT '7', '3' UNION SELECT '8', '4'
), my_list (origin, source, target, target_type, sid, tid, level) AS (
SELECT s.letter, s.letter, t.letter, t.type, s.id, t.id, 0
FROM data s JOIN map ON (s.id = l OR s.id = r)
JOIN data t ON (t.id = l OR t.id = r)
WHERE t.id <> s.id AND s.type <> 'END'
UNION ALL
SELECT my_list.origin, s.letter, t.letter, t.type, s.id, t.id, level + 1
FROM data s JOIN map ON (s.id = l OR s.id = r)
JOIN data t ON (t.id = l OR t.id = r) JOIN my_list ON s.id = my_list.tid
WHERE t.id <> s.id AND s.type <> 'END' AND t.id <> my_list.sid
)
SELECT * FROM my_list
WHERE origin = 'BB' AND target_type = 'END'
ORDER BY level
GO
But the performance is not really good (on my real tables). Then I realise it is the ORs in the join conditions that are causing the issue, then I tried using UNION
my_list (origin, source, target, target_type, sid, tid, level) AS (
SELECT s.letter, s.letter, t.letter, t.type, s.id, t.id, 0
FROM data s JOIN map ON s.id = l JOIN data t ON t.id = r
WHERE s.type <> 'END'
UNION ALL
SELECT s.letter, s.letter, t.letter, t.type, s.id, t.id, 0
FROM data s JOIN map ON s.id = r JOIN data t ON t.id = l
WHERE s.type <> 'END'
UNION ALL
SELECT my_list.origin, s.letter, t.letter, t.type, s.id, t.id, level + 1
FROM data s JOIN map ON (s.id = l OR s.id = r)
JOIN data t ON (t.id = l OR t.id = r) JOIN my_list ON s.id = my_list.tid
WHERE t.id <> s.id AND s.type <> 'END' AND t.id <> my_list.sid
)
The difference is huge (on my real tables, the time is halved). For the above examples, I got
Table 'Worktable'. Scan count 6, logical reads 100
vs
Table 'Worktable'. Scan count 5, logical reads 75
But then when I tried to do the same for the recursive part, e.g.
my_list (origin, source, target, target_type, sid, tid, level) AS (
SELECT s.letter, s.letter, t.letter, t.type, s.id, t.id, 0
FROM data s JOIN map ON s.id = l JOIN data t ON t.id = r
WHERE s.type <> 'END'
UNION ALL
SELECT s.letter, s.letter, t.letter, t.type, s.id, t.id, 0
FROM data s JOIN map ON s.id = r JOIN data t ON t.id = l
WHERE s.type <> 'END'
UNION ALL
SELECT my_list.origin, s.letter, t.letter, t.type, s.id, t.id, level + 1
FROM data s JOIN map ON s.id = l
JOIN data t ON t.id = r JOIN my_list ON s.id = my_list.tid
WHERE s.type <> 'END' AND t.id <> my_list.sid
UNION ALL
SELECT my_list.origin, s.letter, t.letter, t.type, s.id, t.id, level + 1
FROM data s JOIN map ON s.id = r
JOIN data t ON t.id = l JOIN my_list ON s.id = my_list.tid
WHERE s.type <> 'END' AND t.id <> my_list.sid
)
the result becomes slower (on my real tables, 5 times slower).
I wonder why is it slower and if there is other way to get rid of the ORs to speed up the query? Database is MS SQL SERVER 2008R2
Thank you
Upvotes: 2
Views: 166
Reputation: 7181
I might have gotten this wrong, but can't you push the predicate:
WHERE origin = 'BB'
inside the CTE. I.e.:
;WITH my_list (origin, source, target, target_type, sid, tid, level) AS (
SELECT s.letter, s.letter, t.letter, t.type, s.id, t.id, 0
FROM data s
JOIN map
ON s.id = l
JOIN data t
ON t.id = r
WHERE s.letter = 'BB'
UNION ALL
SELECT s.letter, s.letter, t.letter, t.type, s.id, t.id, 0
FROM data s
JOIN map
ON s.id = r
JOIN data t
ON t.id = l
WHERE s.letter = 'BB'
UNION ALL
SELECT my_list.origin, s.letter, t.letter, t.type, s.id, t.id, level + 1
FROM data s
JOIN map
ON (s.id = l OR s.id = r)
JOIN data t
ON (t.id = l OR t.id = r)
JOIN my_list
ON s.id = my_list.tid
WHERE t.id <> s.id
AND s.type <> 'END'
AND t.id <> my_list.sid
)
SELECT * FROM my_list
WHERE origin = 'BB' AND target_type = 'END'
ORDER BY level
Does that improve performance?
Upvotes: 1