Learn12
Learn12

Reputation: 216

How to refactor a subquery that has a join inside SQL Server

I have been tasked with removing the subqueries from one of our stored procedures.

There is one particular subquery that I do not know how to refactor and extract so I can eliminate it.

It starts with a LEFT JOIN and then inside, there is another JOIN that selects some records.

Here is the stripped down code, I'm not sure if you need the full code:

LEFT JOIN (
    SELECT
        T1.LOGID, T2.RGROUP
    FROM 
        TABLEONE T1 
    JOIN
        TABLETWO T2 ON T1.RID = T2.ID 
                    AND CAST(T2.VAL AS BIGINT) = CAST(@SOMEVAR AS BIGINT) 
) RR ON (RR.LOGID = T3.ID)

Can someone walk me through how to eliminate this subquery and replace it with joins on the main query?

Examples would be greatly appreciated!

Please let me know your thoughts,

Thank you!

Upvotes: 0

Views: 139

Answers (2)

Alex Sham
Alex Sham

Reputation: 458

You just need to add proper "WHERE" clause, replacing your subquery inner join condition^

LEFT JOIN TABLEONE T1   ON  T1.LOGID = T3.ID
LEFT JOIN TABLETWO T2   ON  T2.ID = T1.RID
                            AND CAST(T2.VAL as bigint) = CAST(@SOMEVAR as bigint) 
WHERE T2.ID IS NOT NULL OR T1.LOGID IS NULL

you see: you take all T3 rows and only those rows, that exist in T2, joined with T1

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270061

First, refactoring is not necessary from a performance perspective. SQL Server is smart enough to restructure the joins for the optimal plan.

Second, you can phrase this as:

T3 LEFT JOIN 
(T1 JOIN
 T2
 ON T1.RID = T2.ID and 
        CAST(T2.VAL as bigint) = CAST(@SOMEVAR as bigint)
)
ON T1.LOGID = T3.ID

The parentheses are important. In practice, you can often replace this with:

T3 LEFT JOIN 
T1 JOIN
ON T1.LOGID = T3.ID
T2
ON T1.RID = T2.ID and 
    CAST(T2.VAL as bigint) = CAST(@SOMEVAR as bigint)

But there is a subtle difference between the two. In the first, T1 might match T3 but not T2. In that case, the columns in T1 would not have any values. In the second, these would have values. The second behavior is more commonly what you want, but there may be some cases where you want the first behavior.

Upvotes: 3

Related Questions