Reputation: 123
I have huge table and 2 columns A and B in hive.
Rows are identical where either A or B or both have same values.
I would like to build a new column and assign a value based on this comparision :
A B
-- --
a b
a c
d b
p q
Result :
A B New_Col
-- -- -----
a b id1
a c id1
d b id1
p q id2
Any efficient solution?
Upvotes: 1
Views: 1729
Reputation: 2177
You can achieve this by using conditional functions in Hive in your SELECT statement:
SELECT A, B, IF(A == 'a' OR B == 'b', 'id1', 'id2') AS New_Col FROM huge_table;
Here's how to create a new_huge_table
from your huge_table
with the new, derived column New_Col
:
CREATE TABLE my_database.new_huge_table (A STRING, B STRING, New_Col STRING);
INSERT OVERWRITE TABLE my_database.new_huge_table
SELECT A, B, IF(A == 'a' OR B == 'b', 'id1', 'id2') AS New_Col FROM huge_table;
Upvotes: 1