Freshguy12

Reputation: 3

parquet in Hive

I am learning how to use Hive with the Hortonworks Sandbox, but I am not able to get it working. When I create a table, the data is not shown, so I decided to try this query:

Create external table tripinfo (
  VendorID string,
  pickup string,
  dropoff string,
  Passenger string,
  distance string,
  Pickloc string,
  droploc string,
  rate string,
  store string,
  payment string,
  amount string,
  extra string,
  tax string,
  improvement string,
  tip string,
  tolls string,
  tap string)
row format serde "parquet.hive.serde.PaquetHiveSerDe"
stored as 
INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
Location "/user/taxi/yellow data/trip/";

However, it shows this error: Error while compiling statement: FAILED: SemanticException Cannot find class 'parquet.hive.DeprecatedParquetInputFormat'

The parquet file is already in HDFS, separated by " ", and is huge (as you might expect). Am I doing something wrong, or is there any way to create a table with parquet data?

Upvotes: 0

Views: 1010

Answers (1)

OneCricketeer

Reputation: 191681

I assume you are reading this page? - https://cwiki.apache.org/confluence/display/Hive/Parquet

Notice the header "Hive 0.10 - 0.12": the SerDe and input/output format classes listed there apply only to those old versions. The sandbox should be using at least Hive 1.x, maybe even 2.x, so you should just use a query like this:

CREATE EXTERNAL TABLE name (
...
) 
STORED AS PARQUET
LOCATION "___";

Binary data in Parquet files shouldn't be separated by ASCII spaces, unless you are referring to a single string-type column.
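
Applied to the table from the question, that could look roughly like the sketch below. It assumes Hive 0.13 or later, where STORED AS PARQUET is built in. The column names, the all-string types, and the location are copied from the question; they are assumptions about what the Parquet files actually contain, so adjust them to match the real schema of the files.

```sql
-- Sketch only: assumes Hive 0.13+, so no SerDe or input/output format
-- classes need to be named explicitly.
-- Column names and types are taken from the question and may not match
-- the schema stored inside the Parquet files.
CREATE EXTERNAL TABLE tripinfo (
  VendorID string,
  pickup string,
  dropoff string,
  Passenger string,
  distance string,
  Pickloc string,
  droploc string,
  rate string,
  store string,
  payment string,
  amount string,
  extra string,
  tax string,
  improvement string,
  tip string,
  tolls string,
  tap string)
STORED AS PARQUET
LOCATION "/user/taxi/yellow data/trip/";

-- Quick check that Hive can read rows from the location.
SELECT * FROM tripinfo LIMIT 10;
```

If the SELECT returns NULLs or fails with a type error, the declared columns most likely don't line up with the schema in the Parquet files.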

Upvotes: 0
