SQL Server Query Join taking very long when adding an additional column in the join

Question

I'm running a basic query on a table with a unique ID in each table, which is a non-clustered index in each. One table has 8 million rows, and the other has 800,000 rows.

When I run the following, it returns 24,000 rows in less than a second:

select     
    a.[ID]      
    ,b.[ID]    
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id

However, when I add an additional column in the join which will significantly reduce the recordset, this takes about 8 minutes

select     
    a.[ID]      
    ,b.[ID]    
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id 
    AND a.code_letter = b.code_letter

The "code_letter" column is just one letter, and is set to varchar(1). I'm scratching my head trying to understand why this is hanging. The issue is that I've run a dynamic sql insert query with 20,000 permutations of joins, and it's taking way too long.

Edit After trying this many ways, I realized that a simple select * seems to work very efficiently, while selecting specific columns is the culprit. Here is the execution plan when I select *:

Here is the execution plan when I select specific columns:

Again, my join is exactly the same, but the column selections are different.

Srinika Pinnaduwage · Accepted Answer

OP said, he didn't get the expected results, and based on his observations, I'll provide a different solution.

~~What I would do is, execute the following and get data to a temp table~~

select a.[ID] as aID ,b.[ID] as bID ,a.code_letter as aCode_letter ,b.code_letter as bCode_letter into #t from dbo.tbl_1 a join dbo.tbl_2 b on a.unique_id = b.unique_id

and then exec the following

Select aID, bID from #t Where aCode_letter = bCode_letter

select DISTINCT a.*    
into #ta
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id 

select DISTINCT b.*    
into #tb
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id

And exec

Select a.ID, b.ID 
from #ta a
Inner Join #tb b 
on a.unique_id = b.unique_id and a.Code_letter = b.Code_letter

SQL Server Query Join taking very long when adding an additional column in the join

Answers (1)

Related Questions