Reputation: 703
I have the following data where id
is an Integer and vectors
is an array:
id, vectors
1, [1,2,3]
2, [2,3,4]
3, [3,4,5]
I would like to explode the vectors
column with its index postioning such that it looks like this:
+---+-----+------+
|id |index|vector|
+---+-----+------+
|1 |0 |1 |
|1 |1 |2 |
|1 |2 |3 |
|2 |0 |2 |
|2 |1 |3 |
|2 |2 |4 |
|3 |0 |3 |
|3 |1 |4 |
|3 |2 |5 |
+---+-----+------+
I figured that I can do this using Spark Scala using selectExpr
df.selectExpr("*", "posexplode(vectors) as (index, vector)")
However, this is a relatively simple task and I would like to avoid writing ETL scripts and was thinking if there is anyway the expression can be used and creating a view for easy access through Presto.
Upvotes: 0
Views: 1395
Reputation: 2032
You can use Lateral view
of Hive
to explode
array data.
Try below query -
select
id, (row_number() over (partition by id order by col)) -1 as `index`, col as vector
from (
select 1 as id, array(1,2,3) as vectors from (select '1') t1 union all
select 2 as id, array(2,3,4) as vectors from (select '1') t2 union all
select 3 as id, array(3,4,5) as vectors from (select '1') t3
) t
LATERAL VIEW explode(vectors) v;
Upvotes: 0
Reputation: 5316
This is easy to do in Presto using standard SQL syntax with UNNEST
:
WITH data(id, vector) AS (
VALUES
(1, array[1,2,3]),
(2, array[2,3,4]),
(3, array[3,4,5])
)
SELECT id, index - 1 AS index, value
FROM data, UNNEST(vector) WITH ORDINALITY AS t(value, index)
Note that the index produced by WITH ORDINALITY
is 1-based, so I subtracted 1 from it to produce the output you included in your question.
Upvotes: 4