Sumit Khanna

Reputation: 3

select from parquet table returns nothing in hive

I just followed this to create a simple parquet file.

scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
scala> val employee = sqlContext.read.json("employee")
scala> employee.write.parquet("employee.parquet")

The parquet file gets created and is fine.
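(A quick way to sanity-check the write is to read the file back in the same spark-shell session and print the schema; this is only a sketch using the paths above.)

scala> val check = sqlContext.read.parquet("employee.parquet")
scala> check.printSchema()   // should list the inferred fields: age, name, title
scala> check.show()          // should print the 3 records from the JSON input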

Then I create a Hive external table, providing this employee.parquet as my location. Please note it is a normal file system path, not s3:// or hdfs://.

This is my Hive table create query:

create external table employees (a String, b String, c Int) stored as PARQUET location '/Users/Sumit/Documents/Repos/misc_codes/employees.parquet';

It says OK, meaning the table is created. It even shows up in show tables;

But when I do:

select * from employees;

It returns nothing, just an OK. I do believe I had 3 records in my employee.json, like this:

{"age": 50, "name": "adi", "title": "sir"}
{"age": 60, "name": "jyoti", "title": "mam"}
{"age": 14, "name": "sumit", "title": "baalak"}

It is getting successfully generated as a parquet file, so where did I go wrong?

Thanks,

Upvotes: 0

Views: 963

Answers (1)

PSB

Reputation: 41

Column names in your Hive table should match the names in the JSON file, though the order of the columns doesn't matter:

create external table employees (name String, title String, age Int) stored as PARQUET location '/Users/Sumit/Documents/Repos/misc_codes/employees.parquet';
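One way to confirm the exact field names the Parquet file carries is to print its schema from spark-shell; a minimal sketch, assuming the same file path as in the question:

scala> sqlContext.read.parquet("/Users/Sumit/Documents/Repos/misc_codes/employees.parquet").printSchema()
// the printed field names (age, name, title) are the column names the Hive DDL should use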

Upvotes: 1
