Reputation: 1133
I'm having a little problem getting SQL Server returning the same results as Oracle for our SQL tree query.
We have a link table with attributes
CHILD_ID, LINK_ID, PARENT_ID
and a element table with ELEMENT_ID
In Oracle we use this
SELECT * FROM LINKS
JOIN ELEMENTS ON ELEMENT_ID = CHILD_ID
WHERE LINK_ID IN
(SELECT LINK_ID FROM LINKS CONNECT BY PRIOR CHILD_ID = PARENT_ID START WITH PARENT_ID = 'startid')
In SQL Server we use this
WITH TREE_LINKS AS
(SELECT CHILD_ID, LINK_ID FROM LINKS WHERE PARENT_ID = 'startid'
UNION ALL
SELECT CURRENT_LINKS.CHILD_ID, CURRENT_LINKS.LINK_ID
FROM LINKS CURRENT_LINKS
INNER JOIN TREE_LINKS t1 ON CURRENT_LINKS.PARENT_ID = t1.CHILD_ID)
SELECT * FROM TREE_LINKS
INNER JOIN LINKS ON TREE_LINKS.LINK_ID = LINKS.LINK_ID
INNER JOIN ELEMENTS ON ELEMENTS.ELEMENT_ID = TREE_LINKS.CHILD_ID
This works perfectly fine apart from 1 issue.
In Oracle we only get each unique link, based on LINK_ID
. In SQL Server we get all the links describing the full tree which can include duplicates when 1 element exists below multiple other elements in different branches of the structure.
This means we get a large amounts of duplicate rows from SQL Server. I tested adding
SELECT DISTINCT TREE_LINKS.LINK_ID AS TREE_LINK_ID, * from TREE_LINKS
in the last select statement but it just takes as long as the server has more work to remove the duplicates if found in different branches.
In 1 test case at the moment we have Oracle returning 20,000 rows and SQL Server returning 1.6 million rows. So far I have found no way to get SQL Server returning the same results as fast.
FYI : Adding DISTINCT in the recursion causes
DISTINCT operator is not allowed in the recursive part of a recursive common table expression 'TREE_LINKS'.
Edit: - An example
If we have links like this
PARENT_ID, LINK_ID, CHILD_ID
1 1 2
2 2 3
3 3 4
1 4 3
There are 4 unique elements, but there are 6 elements in the complete tree. This is because there are 2 paths to element 3
Upvotes: 0
Views: 663
Reputation: 3535
SELECT LINK_ID
FROM LINKS
START WITH PARENT_ID = 'startid'
CONNECT BY PRIOR CHILD_ID = PARENT_ID
should be equivalent to
EDIT BELOW (AND in first edit): changed a PARENT_ID to LINK_ID in the second select
WITH TREE_LINKS(CHILD_ID, LINK_ID) AS
(SELECT CHILD_ID, LINK_ID
FROM LINKS
WHERE PARENT_ID = 'startid'
UNION ALL
SELECT NLINKS.CHILD_ID, NLINKS.LINK_ID
FROM LINKS as NLINKS, TREE_LINKS
WHERE TREE_LINKS.CHILD_ID = NLINKS.PARENT_ID)
SELECT LINK_ID FROM TREE_LINKS
according to Simulation of CONNECT BY PRIOR of ORACLE in SQL SERVER
The rest should be similar from what I can tell.
EDIT: maybe something like:
WITH TREE_LINKS(CHILD_ID, LINK_ID) AS
(SELECT CHILD_ID, LINK_ID
FROM LINKS
WHERE PARENT_ID = 'startid'
UNION ALL
SELECT NLINKS.CHILD_ID, NLINKS.LINK_ID
FROM LINKS as NLINKS, TREE_LINKS
WHERE TREE_LINKS.CHILD_ID = NLINKS.PARENT_ID)
SELECT * FROM LINKS as TLINKS
JOIN ELEMENTS ON ELEMENT_ID = TLINKS.CHILD_ID
WHERE TLINKS.LINK_ID IN
(SELECT TREE_LINKS.LINK_ID FROM TREE_LINKS)
but I don't have mssql or oracle to test it on currently
EDIT: removed "AND TREE_LINKS.LINK_ID <> NLINKS.LINK_ID" from query as it was not needed.
Upvotes: 1