rp346

Reputation: 7028

Combine Avro files into one

I want to combine several small Avro files into a single Avro file, keeping the same schema, using Pig.

I tried to do this:

REGISTER avro-1.7.2.jar

a = load '$SOURCE' using org.apache.pig.piggybank.storage.avro.AvroStorage ();
store a into '$TARGET' using org.apache.pig.piggybank.storage.avro.AvroStorage (); 

but it failed with the following error:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.apache.pig.piggybank.storage.avro.AvroStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

How do I combine small Avro files into one file using Pig?

Upvotes: 0

Views: 3571

Answers (1)

Mikko Kupsu

Reputation: 371

Firstly, AvroStorage is part of piggybank, so you also need to register piggybank.jar.

REGISTER piggybank.jar;

Secondly, AvroStorage requires additional libraries, so you also need to register json-simple-1.1.1.jar.

REGISTER json-simple-1.1.1.jar;

Thirdly, if you want to use a more recent version of Avro, you also need avro-mapred.jar.

I have the following lines in my Pig scripts:

REGISTER lib/piggybank-0.13.0.jar;
REGISTER lib/avro-1.7.7.jar;
REGISTER lib/avro-mapred-1.7.7.jar;
REGISTER lib/json-simple-1.1.1.jar;
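
Putting the pieces together, a minimal sketch of the whole script might look like the following. It assumes the four jars sit in a lib/ directory next to the script and that $SOURCE and $TARGET are passed in as parameters, as in the question. The SET line is an assumption on my part and is not something I have verified with AvroStorage: pig.maxCombinedSplitSize asks Pig to pack small input files into larger splits, which should reduce the number of map tasks and therefore the number of output files.

```pig
-- Register piggybank plus the libraries AvroStorage depends on.
-- Jar paths and versions are examples; adjust them to your installation.
REGISTER lib/piggybank-0.13.0.jar;
REGISTER lib/avro-1.7.7.jar;
REGISTER lib/avro-mapred-1.7.7.jar;
REGISTER lib/json-simple-1.1.1.jar;

-- Optional (assumption): combine small input files into splits of up to
-- about 1 GB (value is in bytes) so that fewer map tasks write output.
SET pig.maxCombinedSplitSize 1073741824;

-- Same load/store as in the question; the schema is read from the input files.
a = LOAD '$SOURCE' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
STORE a INTO '$TARGET' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
```

It could be run as pig -param SOURCE=/path/to/small/files -param TARGET=/path/to/output combine.pig, where combine.pig and both paths are placeholder names. Whether you end up with exactly one file depends on how many tasks write output, so check the target directory afterwards.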

Upvotes: 1
