Reputation: 841
I'm trying to transpose dataset in Hive that is structured as of the following:
Id1 Id2 Event
1 1 7
2 2 3
2 2 7
3 3 8
3 3 7
1 2 3
1 2 7
Some id combos have many events (close to 20ish unique events) and I need to transpose them into 20 columns for each unique combination of Id1 and Id2 such as:
Id1 Id2 event1 event2 event3 event4 event5.......event20
1 1 7
2 2 3 7
3 3 8 7
1 2 3 7
If it is possible I would also like to know how to transpose in the following form without using 20 max() function: (Here the event value will be the suffix and every occurrence will be counted as 1)
Id1 Id2 event_7 event_3 event_8 ........
1 1 1
2 2 1 1
3 3 1 1
1 2 1 1
Thank you so much!
Upvotes: 0
Views: 165
Reputation: 405
Can you try this and see if this works? What I'm doing is, first rank the data of id1 so the rank will be 1,2 for the duplicate values of Id1. Then use the rank and concatenate with 'event_' to form something like 'event_1', 'event_2' and so on.
There are 2 options below. 1) is using named_struct 2) using to_map I haven't tried this so there might be some syntax issue, but hope you get the idea.
with data as (
select
id1, id2, event,
row_number() over (partition by id1 order by id1) as rnk
from table
),
collect_data as (
select id1, id2, collect_set(named_struct(concat("event_', rnk), event)) kv
from data
group by id1,id2
)
select id1, id2, kv[0]['event_1'], kv[0]['event_2']...
from collect_data;
OR
with data as (
select
id1, id2, event,
row_number() over (partition by id1 order by id1) as rnk
from table
),
collect_data as (
select id1, id2, to_map(concat('event_',rnk), event) as kv
from data
group by id1, id2
)
select id1, id2,
kv['c1'] AS c1,
kv['c2'] AS c2,
kv['c3'] AS c3
from collect_data;
Upvotes: 1