Reputation: 45
I am loading a file to PigStorage. The file has a column Newvalue
, a free text column which includes commas in it. When I specify comma as delimiter this gives me problem. I am using following code.
inpt = load '/home/cd36630/CRM/1monthSample.txt' USING PigStorage(',')
AS (BusCom:chararray,Operation:chararray,OperationDate:chararray,
ISA:chararray,User:chararray,Field:chararray,Oldvalue:chararray,
Newvalue:chararray,RecordId:chararray);
Any help is appreciated.
Upvotes: 0
Views: 251
Reputation: 5186
If the input is in csv form then you can use CSVLoader
to load it. This may fix your issue.
If this doesn't work then you can load into a single chararray and then write a UDF to split the total line in a way that respects the spaces in Newvalue
. EG:
register 'myudfs.py' using jython as myudfs ;
A = LOAD '/home/cd36630/CRM/1monthSample.txt' AS (total:chararray) ;
B = FOREACH A GENERATE myudf.prepare_input(total) ;
Upvotes: 1