Reputation: 389
I have a column A which elements consist of lists with multiple string elements (not arrays), such as:
**A**
1 'A123','B234','C123'
2 'A123','B234'
3 'A124',
The amount of list elements varies across records. The list elements however are always of the same length and are separated by comma.
I need to extract these string elements to a separate table/list B and get the distinct elements such as:
**B**
1 'A123'
2 'A124'
3 'B234'
4 'C123'
How can I do that using SQL? I'm using Impala or Hive I'm using Cloudera Hue for analysis.
Thanks,
KS
Upvotes: 1
Views: 940
Reputation: 1772
First use lateral view explode
to create a column with array elements, then dictinct, e.g.
SELECT distinct elem as col1
FROM tableA LATERAL VIEW explode(split(array_column, ',')) aTable AS elem
See Hive's manual for more details on lateral view
Upvotes: 3