Reputation: 53
I am just trying to understand the concept behind joining of 2 tables with an OR condition. My requirement is: I need to join 2 tables Table1 [colA, colB] and Table2 [colX, colY] on columns Table1.colA = Table2.colB but if colA is NULL the condition should be Table1.colB = Table2.colY.
Do I need to do join them separately and then do union? Or is there a way I can do it in one join? Note that I have millions of records in both tables and its a left join and the tables reside in HIVE. I don't have a reproducible example, just trying to understand the concept.
Upvotes: 0
Views: 93
Reputation: 1270873
Translate your conditions directly:
SELECT *
FROM table1 t1 JOIN
table2 t2
ON (t1.cola = t2.colb) or
(t1.cola is null and t1.colb = t2.coly)
Usually, or
is a performance killer in join
s. This wold often be expressed using two separate left join
s:
SELECT . . . , COALESCE(t2a.col, t2b.col) as col
FROM table1 t1 LEFT JOIN
table2 t2a
ON (t1.cola = t2.colb) LEFT JOIN
table2 t2b
ON t1.cola is null and t1.colb = t2.coly;
Upvotes: 0
Reputation: 573
While I'm not familiar with HiveQL, in SQL server this could be accomplished as follows:
SELECT *
FROM table1 t1
JOIN table2 t2
ON COALESCE(t1.cola, t1.colb) = CASE
WHEN t1.cola IS NULL THEN t2.coly
ELSE t2.colx
END
The logic should be fairly readable.
Upvotes: 2