frazman
frazman

Reputation: 33213

pivot from multiple rows to multiple columns in hive

I have a hive table like following

(id:int, vals: Map<String, int> , type: string)
id, vals, type
1, {"foo": 1}, "a"
1, {"foo": 2}, "b"
2, {"foo": 3}, "a"
2, {"foo": 1}, "b"

Now, there are only two types I want to change this to following schema

id, type_a_vals, type_b_vals
1, {"foo", 1}, {"foo": 2}
2, {"foo": 3}, {"foo": 1}

and if any "type" is missing, it can be null?

Upvotes: 0

Views: 135

Answers (1)

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

An easy way keeping in mind the map column would be a self join.

select ta.id,ta.vals,tb.vals
from (select * from tbl where type = 'a') ta 
full join (select * from tbl where type = 'b') tb on ta.id = tb.id 

You can use conditional aggregation to solve questions like these as below. However, doing so on a map column would produce an error.

select id
      ,max(case when type = 'a' then vals end) as type_a_vals
      ,max(case when type = 'b' then vals end) as type_a_vals
from tbl
group by id

Upvotes: 1

Related Questions