Reputation: 96
My use case is as following:
I want to copy data from Table-A to Table-B and transform field1 from array of struct to array of string where the string is the val1 property of struct in table-A and ignore val2.
Table-A:
field1: array<struct<val1: str, val2: int>>
sample data:
[{val1: "abc", val2: 123}, {val1: "def", val2: 456}], [{val1: "xyz", val2: 789}]
Table-B:
field1: array<string>
sample data:
["abc", "def"], ["xyz"]
I am not able to figure out how I can select the field1 column with transformation on it through hive query language.
The things I was able to figure out was I could explode the array, perform a select of the val1 and then do collect_list but after trying multiple times could not get the correct syntax.
My query was like:
select collect_list(select col.val1
from explode(field1) as col) from table-A
I also want to strictly do it through hiveql and not through a udf in python.
Thank you.
Upvotes: 0
Views: 5715
Reputation: 21
There is some magic in the handling of arrays that allow you to do this:
select t.myarray as original_array, t.myarray.val1 from mydata t
I.e. selecting struct field val1
from an array of structs will return an array of val1
From http://thornydev.blogspot.com/2013/07/querying-json-records-via-hive.html
Upvotes: 0
Reputation: 38325
Explode original array using Lateral View + explode and collect array of struct.val1 using collect_set() or collect_list():
with mydata as (--This is your data example, use your table instead of this CTE
select stack (2,
array(named_struct("val1", "abc", "val2", 123), named_struct("val1", "def", "val2", 456)),
array(named_struct("val1", "xyz", "val2", 789))
) as myarray
)
select t.myarray as original_array, collect_set(s.val1) as result_array
from mydata t
lateral view explode(myarray) e as s --struct
group by t.myarray
Result:
original_array result_array
[{"val1":"abc","val2":123},{"val1":"def","val2":456}] ["abc","def"]
[{"val1":"xyz","val2":789}] ["xyz"]
Your structure can be also declared as map, not struct. In this case use s['val1']
instead of s.val1
to get map element.
Upvotes: 1