sqlnewbie1979
sqlnewbie1979

Reputation: 119

SQL Server Query Join taking very long when adding an additional column in the join

I'm running a basic query on a table with a unique ID in each table, which is a non-clustered index in each. One table has 8 million rows, and the other has 800,000 rows.

When I run the following, it returns 24,000 rows in less than a second:

select     
    a.[ID]      
    ,b.[ID]    
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id 

However, when I add an additional column in the join which will significantly reduce the recordset, this takes about 8 minutes

select     
    a.[ID]      
    ,b.[ID]    
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id 
    AND a.code_letter = b.code_letter

The "code_letter" column is just one letter, and is set to varchar(1). I'm scratching my head trying to understand why this is hanging. The issue is that I've run a dynamic sql insert query with 20,000 permutations of joins, and it's taking way too long.

Edit After trying this many ways, I realized that a simple select * seems to work very efficiently, while selecting specific columns is the culprit. Here is the execution plan when I select *:

select all query

Here is the execution plan when I select specific columns:

select specific columns

Again, my join is exactly the same, but the column selections are different.

Upvotes: 0

Views: 1273

Answers (1)

Srinika Pinnaduwage
Srinika Pinnaduwage

Reputation: 1042

OP said, he didn't get the expected results, and based on his observations, I'll provide a different solution.

What I would do is, execute the following and get data to a temp table

select     
    a.[ID] as aID      
    ,b.[ID] as bID
    ,a.code_letter as aCode_letter  
    ,b.code_letter as bCode_letter
into #t
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id 

and then exec the following

Select aID, bID from #t Where aCode_letter = bCode_letter

select DISTINCT a.*    
into #ta
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id 

select DISTINCT b.*    
into #tb
from     
    dbo.tbl_1 a    
join     
    dbo.tbl_2 b    
on 
    a.unique_id = b.unique_id 

And exec

Select a.ID, b.ID 
from #ta a
Inner Join #tb b 
on a.unique_id = b.unique_id and a.Code_letter = b.Code_letter

Upvotes: 1

Related Questions