Reputation: 165
I'm trying to read a Parquet file with Impala:
impala-shell> SELECT * FROM `/path/in/hdfs/*.parquet`
I know I can do this with Spark or Drill, but I wonder if it's possible with Impala?
Thanks
Upvotes: 1
Views: 1801
Reputation: 2767
You would need to create a structured table on top of the Parquet files to query them via Impala.
Here is a general example of an external table pointing to a Parquet directory. The Cloudera docs cover all the methods:
https://www.cloudera.com/documentation/enterprise/latest/topics/impala_parquet.html#parquet_ddl
CREATE EXTERNAL TABLE ingest_existing_files LIKE PARQUET '/user/etl/destination/datafile1.dat'
STORED AS PARQUET
LOCATION '/user/etl/destination';
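
Once the table exists, the files can be queried by table name instead of by path. A minimal sketch, reusing the ingest_existing_files table from the example above (the LIMIT value is arbitrary, and the REFRESH step only matters if new files land in the directory outside of Impala):

```sql
-- Check the columns Impala inferred from the Parquet file's schema.
DESCRIBE ingest_existing_files;

-- Query the files through the table rather than by HDFS path.
SELECT * FROM ingest_existing_files LIMIT 10;

-- After new files are added to /user/etl/destination outside of Impala,
-- reload the table's file metadata so they become visible to queries.
REFRESH ingest_existing_files;
```

Because the table is EXTERNAL, dropping it later removes only the table definition and leaves the Parquet files in place.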
Upvotes: 2