Niyas

Reputation: 515

LOAD csv file in PigLatin

I'm trying to load a CSV file in Pig Latin. Each record has the following format:

"ABBOTT,DEEDEE W",GRADES 9-12 TEACHER,"52,122.10",0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010

I tried the following code:

A = LOAD '/user/hduser/salaryTravel.csv' using PigStorage(',')  AS (name:chararray,job:chararray,salary:float,TA:float,type:chararray,org:chararray,year:int);

But the output is as follows:

("ABBOTT,DEEDEE W",,,122.10",0,)

The name field is split into separate fields because it contains a comma (','), even though the value is enclosed in double quotes. How can I read this record correctly?

Upvotes: 1

Views: 1189

Answers (1)

Murali Rao

Reputation: 2287

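PigStorage(',') splits on every comma and ignores the double quotes. I would suggest loading the data with CSVExcelStorage or CSVLoader from piggybank instead, since both treat a quoted value as a single field: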

REGISTER piggybank.jar;

A = LOAD '/user/hduser/salaryTravel.csv' using org.apache.pig.piggybank.storage.CSVExcelStorage()  AS (name:chararray,job:chararray,salary:float,TA:float,type:chararray,org:chararray,year:int);

or

REGISTER piggybank.jar;

A = LOAD '/user/hduser/salaryTravel.csv' using org.apache.pig.piggybank.storage.CSVLoader()  AS (name:chararray,job:chararray,salary:float,TA:float,type:chararray,org:chararray,year:int);
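
One more thing to watch for: the salary value "52,122.10" contains a thousands separator, so even once the quoting is handled, declaring that field as float will most likely give you a null, because the cast cannot parse the comma. A minimal sketch of one way around it, assuming piggybank.jar is on the path and using the built-in REPLACE function; the aliases raw and salary_txt are just names I made up for illustration:

```pig
REGISTER piggybank.jar;

-- Read the salary as text first so the quoted value "52,122.10" is kept intact.
raw = LOAD '/user/hduser/salaryTravel.csv'
      USING org.apache.pig.piggybank.storage.CSVExcelStorage()
      AS (name:chararray, job:chararray, salary_txt:chararray, TA:float,
          type:chararray, org:chararray, year:int);

-- Strip the thousands separator, then cast the cleaned text to float.
A = FOREACH raw GENERATE
        name,
        job,
        (float) REPLACE(salary_txt, ',', '') AS salary,
        TA,
        type,
        org,
        year;

DUMP A;
```

I have not run this against your file, so treat it as a starting point. If TA can also carry thousands separators, the same REPLACE-then-cast step should apply to it.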

Ref: "REGEX_EXTRACT error in PIG", where I have shared a few code samples.

Upvotes: 2
