Reputation: 494
We have two tables containing huge data. And we have to perform inner join between these two tables. Joining condition is based on 3 columns c1,c2,c3.
Need your help in writing an optimized query for the above scenario. Please refer to the data model below.
table1
c1,c2,c3,c4,c5
table2
c1,c2,c3
Appreciate your help. Thanks.
Upvotes: 0
Views: 193
Reputation: 723
An inner join will give you all the rows that exist in both table 1 and table 2 as well as any extra combinations that exist with the other columns present in table 2. A left semi join like @dimamah mentioned will give you only those rows that are present in the left table 1.
For the first solution where you should expect more rows
SELECT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON (t1.c1=t2.c1 AND t1.c1 = t2.c2 AND t1.c3 = t2.c3)
For the second solution where you will get the same number of rows as t1
SELECT t1.*
FROM table1 t1
LEFT SEMI JOIN table2 t2
ON (t1.c1=t2.c1 AND t1.c1 = t2.c2 AND t1.c3 = t2.c3)
Upvotes: 0
Reputation: 2903
as it seems you are not using any new data from table2 and the logic is "return all rows in table1 that exist in table2". This is done by using left semi join
.
The number of records might increase if table2 is not uniqe (same combination of c1 c2 c3 appears more than once).
Upvotes: 1
Reputation: 771
If I understand Hortonworks' founder Owen O'Malley correctly, you may optimize it by writing
SELECT t1.* from table1 t1
JOIN (SELECT table2.* FROM table2) t2
ON (t1.c1=t2.c1 AND t1.c1 = t2.c2 AND t1.c3 = t2.c3)
Upvotes: 2