Reputation: 1604
Hive's function explode is documented here It is essentially a very practical function that generates many rows from a single one. Its basic version takes a column whose value is an array of values and produces a copy of the same row for each of those values.
I wonder whether such a thing exists in Impala. I haven't been able to find it in the documentation.
Upvotes: 2
Views: 7751
Reputation: 21
Lateral views. In CDH 5.5 / Impala 2.3 and higher, Impala supports queries on complex types (STRUCT, ARRAY, or MAP), using join notation rather than the EXPLODE() keyword. See Complex Types (CDH 5.5 or higher only) for details about Impala support for complex types.
use another table B that unfold from array of values. and then use table a inner join table b.
See Impala Doc https://docs.cloudera.com/documentation/enterprise/5-9-x/topics/impala_langref_unsupported.html
Upvotes: 1
Reputation: 520
The very very tricky approach:
with A as (select 'row 1' as key, 'a;b;c' as value
union all
select 'row 2' as key, 'd;e' as value
union all
select 'row 3' as key, 'f' as value),
B as (select *, length(value) - length(regexp_replace(value,';','')) + 1 as n from A),
-- assuming you have at lest as many rows as different values in a single row
C as (select row_number() over(order by key) as seq, n from B),
D as (select seq from C where seq <= (select max(n) from C))
select key, value, split_part(value,';',seq) as part
from B
cross join D
where seq <= n
order by key,seq
Important: notice the comment "assuming you have at least as many rows as different values in a single row".
Just run it and see the result.
Upvotes: 2
Reputation: 354
Impala does not have any function like EXPLODE in hive to read complex data types and generate multiple rows.
Currently through Impala, we can just read the complex data types in Hive generated tables using dot notation like select employee.empid from table1
.
Impala can query complex type columns only from Parquet tables or Parquet partitions within partitioned tables
Upvotes: 2