tuhin

Reputation: 1

ERROR 1003 (unable to find an operator for alias) in GROUP function in Pig

I have written a .pig file with the following content:

register /home/tuhin/Documents/PigWork/pigdata/piggybank.jar;
define replace org.apache.pig.piggybank.evaluation.string.REPLACE();
define csvloader org.apache.pig.piggybank.storage.CSVLoader();
xyz = load '/pigdata/salaryTravelReport.csv' using csvloader();
x = foreach xyz generate $0 as name:chararray, $1 as title:chararray, replace($2, ',','')  as salary:bytearray, replace($3, ',', '') as travel:bytearray, $4 as orgtype:chararray, $5 as org:chararray, $6 as year:bytearray;
refined = foreach x generate name, title, (float)salary, (float)travel, orgtype, org, (int)year;
year2010 = filter refined by year == 2010;
byjobtitile = GROUP year2010 by title;

The purpose is to remove the ',' in the dollar values in two columns and then group the data by job title. When I run this with the run command there is no error. Even dumping year2010 works fine. But dumping byjobtitle gives an error:


The output of the log file is:

Pig Stack Trace
---------------
ERROR 1003: Unable to find an operator for alias byjobtitle

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1003: Unable to find an operator for alias byjobtitle
    at org.apache.pig.PigServer$Graph.buildPlan(PigServer.java:1544)
    at org.apache.pig.PigServer.storeEx(PigServer.java:1029)
    at org.apache.pig.PigServer.store(PigServer.java:997)
    at org.apache.pig.PigServer.openIterator(PigServer.java:910)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:754)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
    at org.apache.pig.Main.run(Main.java:565)
    at org.apache.pig.Main.main(Main.java:177)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

I am new to big data and don't have much knowledge, but it looks like there is a problem with the data types. Can anyone help me out?

Upvotes: 0

Views: 1814

Answers (1)

Rakesh Shah

Reputation: 46

The issue is due to the CSVLoader you are using, which has ',' as its default delimiter. Since your data also has ',' in some of its fields, such as salary and travel, the positional indexes get shifted. So if your data is something like this:

name title salary travel orgtype org year
A B 10,000 23,1357 ORG_TYPE ORG 2016

then using CSVLoader will make "A B 10" the first field, "000 23" the second field, and "1357 ORG_TYPE ORG 2016" the third field, based on ','.
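To see why, the comma-splitting can be reproduced outside Pig. The following Python sketch uses a made-up raw line modeled on the sample row above (not your actual file) and shows how a plain comma-delimited parse tokenizes it:

```python
import csv

# Hypothetical raw line modeled on the sample data above: the real
# field boundaries are spaces, but salary and travel contain ','
# thousands separators.
line = 'A B 10,000 23,1357 ORG_TYPE ORG 2016'

# A comma-delimited parse (what a default CSV loader effectively does)
# splits the row at every ',', ignoring the intended field boundaries.
fields = next(csv.reader([line]))
print(fields)  # → ['A B 10', '000 23', '1357 ORG_TYPE ORG 2016']
```

The three resulting tokens match the shifted fields described above.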

register /Users/rakesh/Documents/SVN/iReporter/iReporterJobFramework/avro/lib/1.7.5/piggybank.jar;
define replace org.apache.pig.piggybank.evaluation.string.REPLACE();
define csvloader org.apache.pig.piggybank.storage.CSVLoader();
xyz = load '<path to your file>' using csvloader();
a = foreach xyz generate $0;
dump a;


2016-06-07 12:28:12,384 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(A  B   10)

You can make your delimiter different so that it is not present in any field value.

Try using CSVExcelStorage instead; its constructor lets you define the delimiter explicitly:

register /Users/rakesh/Documents/SVN/iReporter/iReporterJobFramework/avro/lib/1.7.5/piggybank.jar;
define replace org.apache.pig.piggybank.evaluation.string.REPLACE();
define CSVExcelStorage org.apache.pig.piggybank.storage.CSVExcelStorage('|','NO_MULTILINE','NOCHANGE');
xyz = load '<path to your file>' using CSVExcelStorage();

It will work fine as long as the same character is not present as both:

  • the delimiter
  • part of any field value
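As a quick sanity check (again in Python, with a hypothetical pipe-delimited version of the sample row), splitting on '|' leaves the commas inside salary and travel untouched:

```python
# Hypothetical pipe-delimited version of the sample row above.
line = 'A|B|10,000|23,1357|ORG_TYPE|ORG|2016'

# Splitting on '|' keeps the comma-containing values intact as single
# fields, so REPLACE can then strip the ',' from salary and travel.
fields = line.split('|')
print(fields[2], fields[3])  # → 10,000 23,1357
```

With the positions stable, the casts to float and int in the original script operate on whole values instead of fragments.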

Upvotes: 1
