Reputation: 3609
I am using a Cloudera CDH3 cluster in pseudo-distributed mode. In CDH3 the Pig version is 0.8.
I would like to read a CSV or Excel file using a Pig script.
I downloaded piggybank-0.11.0.jar and placed it in the /home/cloudera/ directory.
My CSV file looks like this:
id name city
100 surrender Chennai
101 raja Chennai
My Pig script is below:
REGISTER '/home/cloudera/piggybank-0.11.0.jar';
A = LOAD '/user/cloudera/inputfiles/sample_rec.csv' USING CSVExcelStorage(',') AS (id:int,name:chararray,city:chararray);
DUMP A;
But I am getting the error below:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve CSVExcelStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.
Do I need to download the piggybank jar for Pig version 0.8?
What is wrong here? Is it possible to read a CSV file in Pig 0.8?
Upvotes: 1
Views: 2438
Reputation: 2287
Specify the complete package name when using CSVExcelStorage():
USING org.apache.pig.piggybank.storage.CSVExcelStorage() AS ...
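Applied to the script from the question, the fix might look like the sketch below. The paths and schema are copied from the question. The DEFINE alias is an optional convenience added here for readability and is not part of the original script. Whether the 0.11.0 piggybank jar is compatible with Pig 0.8 is not confirmed here, so treat this as a starting point rather than a verified solution.

```pig
-- Register the piggybank jar (path taken from the question).
REGISTER '/home/cloudera/piggybank-0.11.0.jar';

-- Optional alias (an addition for illustration): lets the LOAD statement
-- use the short name instead of the fully qualified class name.
DEFINE CSVExcelStorage org.apache.pig.piggybank.storage.CSVExcelStorage();

A = LOAD '/user/cloudera/inputfiles/sample_rec.csv'
    USING CSVExcelStorage()
    AS (id:int, name:chararray, city:chararray);

-- DUMP is a standalone statement; its result is not assigned to an alias.
DUMP A;
```

Without the DEFINE line, the same effect is obtained by writing the fully qualified class name directly in the USING clause.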
Other checks:
Unpack the jar and check that it contains the CSVExcelStorage class.
"," is the default delimiter for CSVExcelStorage, so there is no need to specify it.
Another alternative is to use CSVLoader:
A = LOAD 'a.csv' USING org.apache.pig.piggybank.storage.CSVLoader() AS (f1,f2,f3);
Ref : http://pig.apache.org/docs/r0.8.1/api/org/apache/pig/piggybank/storage/CSVLoader.html
Upvotes: 2