Surender Raja

Reputation: 3609

Reading a CSV File in Pig

I am using a Cloudera CDH3 cluster in pseudo-distributed mode. The Pig version in CDH3 is 0.8.

I would like to read a CSV or Excel file using a Pig script.

I downloaded piggybank-0.11.0.jar and placed it in the /home/cloudera/ directory.

My CSV file looks like this:

id    name       city
100   surrender  Chennai
101   raja       Chennai

My Pig script is below:

REGISTER '/home/cloudera/piggybank-0.11.0.jar';

A = LOAD '/user/cloudera/inputfiles/sample_rec.csv' USING CSVExcelStorage(',') AS (id:int,name:chararray,city:chararray);
B = DUMP A;

But I get the following error:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve CSVExcelStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.

Do I need to download the piggybank jar built for Pig 0.8?

What is wrong here? Is it possible to read a CSV file in Pig 0.8?

Upvotes: 1

Views: 2438

Answers (1)

Murali Rao

Reputation: 2287

Specify the fully qualified class name when using CSVExcelStorage():

USING org.apache.pig.piggybank.storage.CSVExcelStorage() AS ...

Other checks:

  1. Unjar the piggybank jar and check that it actually contains the CSVExcelStorage class.

  2. "," is the default delimiter for CSVExcelStorage, so there is no need to pass it explicitly.
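
Putting these points together, the script from the question might look like the sketch below. It reuses the jar path, input path, and schema from the question, and it assumes the registered jar really does contain CSVExcelStorage (check 1). It also drops the `B =` in front of DUMP, since DUMP is a standalone statement rather than an expression that can be assigned to an alias.

```pig
-- Register the piggybank jar (path taken from the question).
REGISTER '/home/cloudera/piggybank-0.11.0.jar';

-- Use the fully qualified loader name; ',' is the default delimiter,
-- so no constructor argument is passed.
A = LOAD '/user/cloudera/inputfiles/sample_rec.csv'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage()
    AS (id:int, name:chararray, city:chararray);

-- DUMP is a statement on its own; it is not assigned to an alias.
DUMP A;
```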

Another alternative is to use CSVLoader:

 A = LOAD 'a.csv' USING org.apache.pig.piggybank.storage.CSVLoader() AS (f1,f2,f3);

Ref : http://pig.apache.org/docs/r0.8.1/api/org/apache/pig/piggybank/storage/CSVLoader.html

Upvotes: 2
