lydias
lydias

Reputation: 841

Hive: Ways to transpose in Hive sql

I'm trying to transpose dataset in Hive that is structured as of the following:

Id1  Id2 Event
 1    1   7
 2    2   3
 2    2   7
 3    3   8
 3    3   7
 1    2   3
 1    2   7

Some id combos have many events (close to 20ish unique events) and I need to transpose them into 20 columns for each unique combination of Id1 and Id2 such as:

Id1 Id2 event1  event2  event3 event4 event5.......event20
1    1    7       
2    2    3        7
3    3    8        7
1    2    3        7

If it is possible I would also like to know how to transpose in the following form without using 20 max() function: (Here the event value will be the suffix and every occurrence will be counted as 1)

Id1 Id2 event_7 event_3  event_8 ........
1    1    1       
2    2    1        1
3    3    1                1
1    2    1        1

Thank you so much!

Upvotes: 0

Views: 165

Answers (1)

user3327034
user3327034

Reputation: 405

Can you try this and see if this works? What I'm doing is, first rank the data of id1 so the rank will be 1,2 for the duplicate values of Id1. Then use the rank and concatenate with 'event_' to form something like 'event_1', 'event_2' and so on.

There are 2 options below. 1) is using named_struct 2) using to_map I haven't tried this so there might be some syntax issue, but hope you get the idea.

with data as (
    select 
        id1, id2, event, 
        row_number() over (partition by id1 order by id1) as rnk
    from table
    ),
collect_data as (   
    select id1, id2, collect_set(named_struct(concat("event_', rnk), event)) kv
    from data
    group by id1,id2
    )
select id1, id2, kv[0]['event_1'], kv[0]['event_2']...
from collect_data;

OR

with data as (
    select 
        id1, id2, event, 
        row_number() over (partition by id1 order by id1) as rnk
    from table
    ),
collect_data as (   
select id1, id2, to_map(concat('event_',rnk), event) as kv
from data
group by id1, id2
    )
select id1, id2, 
  kv['c1'] AS c1,
  kv['c2'] AS c2,
  kv['c3'] AS c3    
from collect_data;

Upvotes: 1

Related Questions