John Constantine
John Constantine

Reputation: 1092

Concatenation in hive query

I have a hive table like

col1   col2
1     ["apple", "orange"]
1     ["orange", "banana"]
1     ["mango"]
2     ["apple"]
2     ["apple", "orange"]

There data types is

col1 int
col2 array<string>

I want to query something like :

select col1, concat(col2) from table group by col1;

Output should be :

1    ["apple", "orange", "banana", "mango"]
2    ["apple", "orange"]

Is there any function in hive to do this ?

Also I write this data to csv and when I read it as a dataframe I get the col2 dtype as object. Is there a way to output it as an array.

Upvotes: 1

Views: 104

Answers (1)

notNull
notNull

Reputation: 31470

Try by exploding the array then use collect_set function by grouping by col1.

Example:

Input:

select * from table;
OK
dd.col1 dd.col2
1       ["apple","orange"]
1       ["mango"]
1       ["orange","banana"]

select col1,collect_set(tt1)col2 from (
   select * from table lateral view explode(col2) tt as tt1
)cc 
group by col1;

Output:

col1    col2
1       ["apple","orange","mango","banana"]

Upvotes: 1

Related Questions