Reputation: 699
How can I create multiple columns from an existing Hive table? The example data would be like below.
My requirement is to create 2 new columns from the existing table, populated only when a condition is met: col1 when code=1, col2 when code=2.
Expected output:
How can I achieve this with a Hive query?
Upvotes: 1
Views: 427
Reputation: 38290
If you aggregate the required values into arrays, you can then explode the arrays and keep only the rows with matching positions.
Demo:
with
my_table as ( -- use your table instead of this CTE
    select stack(8,
        'a',1,
        'b',2,
        'c',3,
        'b1',2,
        'd',4,
        'c1',3,
        'a1',1,
        'd1',4
    ) as (col, code)
)
select c1.val as col1, c2.val as col2
from
(
    select collect_set(case when code=1 then col else null end) as col1,
           collect_set(case when code=2 then col else null end) as col2
    from my_table
    where code in (1,2)
) s
lateral view outer posexplode(col1) c1 as pos, val
lateral view outer posexplode(col2) c2 as pos, val
where c1.pos=c2.pos
Result:
col1 col2
a b
a1 b1
This approach will not work if the arrays are of different sizes: elements whose position has no match in the other array are dropped by the filter on pos.
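To see whether this limitation applies to your data, you can compare the sizes of the two arrays first. This is only a sketch: the aliases col1_size and col2_size are made up for illustration, and my_table is the same demo CTE as above.

```sql
with
my_table as ( -- use your table instead of this CTE
    select stack(8,
        'a',1, 'b',2, 'c',3, 'b1',2,
        'd',4, 'c1',3, 'a1',1, 'd1',4
    ) as (col, code)
)
-- size() returns the number of elements in each collected array;
-- collect_set skips the nulls produced by the CASE expressions
select size(collect_set(case when code=1 then col end)) as col1_size,
       size(collect_set(case when code=2 then col end)) as col2_size
from my_table
where code in (1,2)
```

If the two sizes differ, the posexplode query above would lose rows.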
Another approach: calculate row_number within each code and full join the two sets on it. This works even if col1 and col2 have a different number of values (the missing values will be null):
with
my_table as ( -- use your table instead of this CTE
    select stack(8,
        'a',1,
        'b',2,
        'c',3,
        'b1',2,
        'd',4,
        'c1',3,
        'a1',1,
        'd1',4
    ) as (col, code)
),
ordered as
(
    select code, col, row_number() over(partition by code order by col) rn
    from my_table
    where code in (1,2)
)
select c1.col as col1, c2.col as col2
from (select * from ordered where code=1) c1
full join
     (select * from ordered where code=2) c2 on c1.rn = c2.rn
Result:
col1 col2
a b
a1 b1
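The same row_number idea can also be written without a join, by grouping on rn and using conditional aggregation. This variant is not part of the original answer; it is a sketch that reuses the ordered CTE from above, and it should pair the values the same way.

```sql
with
my_table as ( -- use your table instead of this CTE
    select stack(8,
        'a',1, 'b',2, 'c',3, 'b1',2,
        'd',4, 'c1',3, 'a1',1, 'd1',4
    ) as (col, code)
),
ordered as
(
    select code, col, row_number() over(partition by code order by col) rn
    from my_table
    where code in (1,2)
)
-- each rn group holds at most one row per code, so max() simply picks
-- that value, or null when the code has no row with this rn
select max(case when code=1 then col end) as col1,
       max(case when code=2 then col end) as col2
from ordered
group by rn
```

As with the full join, a code with fewer values yields null in its column. The order of the output rows is not guaranteed unless you add an explicit ordering.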
Upvotes: 1