Reputation: 33
I have got a Hive Table named Data which is partitioned on daily date, in which I have 3 columns: name
,sex
,city
,age
.
Now I want to add another column named AgeRange
to this existing Hive table in which the values will be dependent based on the data of column Age
.
If Age < 18
, value will be 0
. If Age >= 18 and <= 23
, value will be 1
, else value will be 2
.
How can I write the query for this? This needs to be done for the past data i.e date till yesterday.
Upvotes: 1
Views: 12488
Reputation: 38290
If your Hive does not support Update
or the table is huge, then create new table with additional column and insert overwrite from old table:
insert overwrite table new_table
select
name,
sex,
city,
age,
case when Age < 18 then 0
when Age >= 18 and Age <= 23 then 1
else 2
end AgeRange
from old_table;
then drop old table and rename new table.
Upvotes: 1
Reputation: 31648
You can first add a column and then update
(hopefully your Hive supports it) all records with CASE
statement.
ALTER TABLE Data ADD COLUMNS (AgeRange Varchar);
UPDATE Data SET AgeRange = ( CASE WHEN Age < 18 THEN 0
WHEN Age >= 18 and <= 23 THEN 1
ELSE 2 END );
Upvotes: 1