Reputation: 4296
I am doing left outer join on two tables in hive .
table 1 : name1
table 2 : name2 , class ( only two values Y or N)
names in table1 is superset of table 2.
Final resultant table should have name1, class
my query is
select table1.name1 , translate(table2.class,NULL,'N')
from table1 left outer join table2
on (name1= name2);
will this work ?
I am also tried reexp_replace but it doesn't work
select table1.name1, regexp_replace(table2.class,NULL,'N') from table1 left outer join table2 on (name1= name2);
Upvotes: 0
Views: 7336
Reputation: 4080
When I was writing the Hive translate UDF, I wanted to keep the behaviour the same as the translate UDF in PostGreSQL. Similar to PostGreSQL, if any of the arguments to the translate UDF are NULL, the result will be NULL.
This is also represented in the following lines of Hive source for the UDF:
if (arguments[0].get() == null || arguments[1].get() == null || arguments[2].get() == null) {
return null;
}
Full source is available here.
From what I understand from your question, you want to emit out 'N' when the name exists in table1 but not in table2. If I were doing it, I would do something like (untested):
SELECT
table1.name1,
COALESCE(table2.name2, 'N') as name2
FROM
table1
LEFT OUTER JOIN
table2
ON (table1.name1 = table2.name2);
You can learn more about the coalesce UDF at the Hive UDF wiki page.
Thanks to your question, I have updated the description of the translate UDF on the Hive UDF wiki page to clarify the case when one of the input parameters is NULL.
Upvotes: 1