TopCoder
TopCoder

Reputation: 4296

Hive : How does translate function works if string is NULL?

I am doing left outer join on two tables in hive .

table 1 : name1
table 2 : name2 , class ( only two values Y or N)

names in table1 is superset of table 2.

Final resultant table should have name1, class

my query is

select table1.name1 , translate(table2.class,NULL,'N') 
from table1 left outer join table2 
on (name1= name2);

will this work ?

I am also tried reexp_replace but it doesn't work

select table1.name1, regexp_replace(table2.class,NULL,'N') from table1 left outer join table2 on (name1= name2);

Upvotes: 0

Views: 7336

Answers (1)

Mark Grover
Mark Grover

Reputation: 4080

When I was writing the Hive translate UDF, I wanted to keep the behaviour the same as the translate UDF in PostGreSQL. Similar to PostGreSQL, if any of the arguments to the translate UDF are NULL, the result will be NULL.

This is also represented in the following lines of Hive source for the UDF:

if (arguments[0].get() == null || arguments[1].get() == null || arguments[2].get() == null) {
      return null;
    }

Full source is available here.

From what I understand from your question, you want to emit out 'N' when the name exists in table1 but not in table2. If I were doing it, I would do something like (untested):

SELECT
   table1.name1,
   COALESCE(table2.name2, 'N') as name2
FROM
   table1
   LEFT OUTER JOIN
   table2
   ON (table1.name1 = table2.name2);

You can learn more about the coalesce UDF at the Hive UDF wiki page.

Thanks to your question, I have updated the description of the translate UDF on the Hive UDF wiki page to clarify the case when one of the input parameters is NULL.

Upvotes: 1

Related Questions