Mohit
Mohit

Reputation: 11

How to split semicolon separated data to columns in hive and also transpose it?

I have a dataset in the following format in hive table.

<br><b>|Col_1&nbsp;| Col_2&nbsp; &nbsp;|Col_3</b></<br>
<br>|abc_1&nbsp;|&nbsp;a;b;c;d&nbsp;&nbsp;|m;n</br>
<br>|abc_2  | e;f;d &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|l;h;p<br>

I need to convert the data into the following format.

<br><b>|Col_1&nbsp;|&nbsp;Col_2_OR_3&nbsp;|Value</b></<br>
<br>|abc_1&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;a</br>
<br>|abc_1&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;b</br>
<br>|abc_1&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;c</br>
<br>|abc_1&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;d</br>
<br>|abc_1&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_3&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;m</br>
<br>|abc_1&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_3&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;n</br>
<br>|abc_2&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;e</br>
<br>|abc_2&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;f</br>
<br>|abc_2&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;d</br>
<br>|abc_2&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_3&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;l</br>
<br>|abc_2&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_3&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;h</br>
<br>|abc_2&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;Col_3&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;p</br>

The data initially is unique at values in Col_1.The semicolon separated values in Col_2 and Col_3 need to be separated and transposed as a new column Value and another column needs to be created Col_2_OR_3 which has the values of column header for the column from which the semicolon separated value is picked from Col_2 or Col_3.

Upvotes: 0

Views: 1156

Answers (1)

Maddy RS
Maddy RS

Reputation: 1031

select col_1,'Col_2',col_2_al
from <table_name>
lateral view explode(split(col_2, "\\\;")) col_2_al as col_2_al 
union all
select col_1,'Col_3',col_2_al
from <table_name>
lateral view explode(split(col_3,"\\\;")) col_3_al as col_2_al

Upvotes: 1

Related Questions