Aryan Singh

Reputation: 611

Can we use bucketing in a Hive table backed by an Avro schema?

I am trying to create a Hive table backed by an Avro schema. Below is the DDL:

CREATE TABLE avro_table
ROW FORMAT 
  SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'    
CLUSTERED BY (col_name) INTO N BUCKETS    
STORED AS 
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' 
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'  
TBLPROPERTIES ( 'avro.schema.url' = 'hdfs://sandbox.hortonworks.com:8020/avroschema/test_schema.avsc')

But it throws the following error:

FAILED: ParseException line 3:3 missing EOF at 'clustered' near ''org.apache.hadoop.hive.serde2.avro.AvroSerDe''

I am not sure whether bucketing can be used in a Hive table backed by Avro.

Hive version: 1.2

Can anyone help me or suggest a way to achieve this?

Upvotes: 1

Views: 378

Answers (1)

Tom Harrison

Reputation: 14018

Your clauses are in the wrong order, and the statement is missing a column list. ROW FORMAT must come after CLUSTERED BY, and CLUSTERED BY requires a column name, which presumably needs to be defined as part of the CREATE TABLE command.

I assume the N in N BUCKETS is a placeholder for your actual number of buckets, but if not, that's another error: the bucket count has to be an integer.

I reformatted the query in your question so that I could read it; comparing it against the documented CREATE TABLE syntax made it easier to spot what the parser didn't like.
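
To make the ordering concrete, here is a rough, untested sketch of how the statement might look with the clauses rearranged. The column type (STRING) and the bucket count (4) are placeholders I made up; col_name would have to match a field in your test_schema.avsc, and I am assuming Hive accepts an explicit column list alongside avro.schema.url.

```sql
-- Sketch only: STRING and 4 are assumed placeholder values, not taken from the question.
CREATE TABLE avro_table (
  col_name STRING  -- assumed type; should correspond to a field in the Avro schema
)
-- CLUSTERED BY comes before ROW FORMAT and needs a real integer bucket count
CLUSTERED BY (col_name) INTO 4 BUCKETS
ROW FORMAT
  SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.url' = 'hdfs://sandbox.hortonworks.com:8020/avroschema/test_schema.avsc');
```

If this parses, the remaining question is whether your inserts actually produce bucketed files; on Hive 1.x that usually also depends on hive.enforce.bucketing being set to true before loading data.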

Upvotes: 1
