Kaggle
Kaggle

Reputation: 123

How to derive a new column based on other columns comparisions in Hive?

I have huge table and 2 columns A and B in hive.
Rows are identical where either A or B or both have same values.
I would like to build a new column and assign a value based on this comparision :

A    B
--   --
a    b
a    c
d    b
p    q

Result :

A    B    New_Col
--   --   -----
a    b     id1
a    c     id1
d    b     id1
p    q     id2

Any efficient solution?

Upvotes: 1

Views: 1729

Answers (1)

Edward R. Mazurek
Edward R. Mazurek

Reputation: 2177

You can achieve this by using conditional functions in Hive in your SELECT statement:

SELECT A, B, IF(A == 'a' OR B == 'b', 'id1', 'id2') AS New_Col FROM huge_table;

Here's how to create a new_huge_table from your huge_table with the new, derived column New_Col:

CREATE TABLE my_database.new_huge_table (A STRING, B STRING, New_Col STRING);
INSERT OVERWRITE TABLE my_database.new_huge_table
SELECT A, B, IF(A == 'a' OR B == 'b', 'id1', 'id2') AS New_Col FROM huge_table;

Upvotes: 1

Related Questions