Reputation: 245
I am finding it difficult to load parquet files into Hive tables. I am working on an Amazon EMR cluster with Spark for data processing, and I need to read the output parquet files to validate my transformations. I have parquet files with the following schema:
root
|-- ATTR_YEAR: long (nullable = true)
|-- afil: struct (nullable = true)
| |-- clm: struct (nullable = true)
| | |-- amb: struct (nullable = true)
| | | |-- L: string (nullable = true)
| | | |-- cdTransRsn: string (nullable = true)
| | | |-- dist: struct (nullable = true)
| | | | |-- T: string (nullable = true)
| | | | |-- content: double (nullable = true)
| | | |-- dscStrchPurp: string (nullable = true)
| | |-- amt: struct (nullable = true)
| | | |-- L: string (nullable = true)
| | | |-- T: string (nullable = true)
| | | |-- content: double (nullable = true)
| | |-- amtTotChrg: double (nullable = true)
| | |-- cdAccState: string (nullable = true)
| | |-- cdCause: string (nullable = true)
How can I create a Hive external table using this type of schema and load the parquet files into that Hive table for analysis?
Upvotes: 1
Views: 1011
Reputation: 35249
You can use Catalog.createExternalTable (Spark before 2.2) or Catalog.createTable (Spark 2.2 and later).
The Catalog instance can be accessed through the SparkSession:
val spark: SparkSession
spark.catalog.createTable(...)
The session should be initialized with Hive support enabled.
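As a minimal sketch of how this could look on EMR (the table name and S3 path are placeholders, not values from your job), assuming the parquet files already exist at the given location so Spark can infer the nested schema from the file footers:
import org.apache.spark.sql.SparkSession

// Hive support is required so the table is registered in the Hive metastore.
val spark = SparkSession.builder()
  .appName("parquet-to-hive")
  .enableHiveSupport()
  .getOrCreate()

// Spark 2.2+: register an external table backed by the existing parquet files.
// The schema, including the nested structs, is taken from the parquet files themselves.
spark.catalog.createTable(
  "claims_parquet",                    // hypothetical table name
  "s3://your-bucket/path/to/parquet/", // hypothetical location of your output files
  "parquet"
)

// Nested fields can then be queried with dot notation:
spark.sql("SELECT ATTR_YEAR, afil.clm.amtTotChrg FROM claims_parquet").show()
On Spark before 2.2 the equivalent call would be spark.catalog.createExternalTable with the same arguments. Because the table is external, dropping it later removes only the metastore entry, not the parquet files on S3.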
Upvotes: 0