wishmaster
wishmaster

Reputation: 1611

Using replace function in select with join

Background. I'm using SQL Server. I have two tables in database:

Vendors(Id, Name, Description)
Products(Id, VendorId, Name, Description)

Values in Id column are formatted with prefix 'ID_' in Vendor table.

Values in VendorId column are formatted with prefix 'VE_' in Products table.

E.g. 'VE_001245' in Products refers to 'ID_001245' in Vendors.

(Please, do not propose to change this concept, don't care about the database scheme, and don't suggest adding a foreign key. All of it is just for illustration.)

Question: which one of the following queries is best in the performance context and why?

  1. Use replace function in inner select:

     select v.* from Vendors v
     inner join
     (
         select distinct replace(VendorId, 'VE_', 'ID_') as Id
         from Products
     ) list
     on v.Id = list.Id
    
  2. Use replace function in on-statement:

     select v.* from Vendors v
     inner join
     (
         select distinct VendorId as Id
         from Products
     ) list
     on v.Id = replace(list.Id, 'VE_', 'ID_')
    

There is only a clustered index in each table (by Id column). Each table can contain millions of rows.

Upvotes: 1

Views: 17454

Answers (1)

Sonam
Sonam

Reputation: 3466

Both the queries are almost same in terms of performance. In the first query sorting is done twice, once when you are selecting the distinct records and again when it is performing an inner join, and in the end a merge join is there to select the final result set. Whereas in second query sorting is done only once but Hash join is being performed which is more expensive then merge join. So both the queries are same performance wise in the scenario when you don't have any index on the table.

Upvotes: 0

Related Questions