Reputation: 609
The following is my code, which runs PigRunner and reads the result through PigStats:
String[] args = {"abc.pig"};
PigStats stats = PigRunner.run(args,null);
System.out.println("Stats : " + stats.getReturnCode());
OutputStats os = stats.result("B");
Iterator<Tuple> it = os.iterator();
while (it.hasNext()) {
    Tuple t = it.next();
    System.out.println(t.getAll());
}
Contents of abc.pig:
A = load 'Courses' using PigStorage(' ');
B = foreach A generate $0 as id;
dump B;
I get the correct output, but it is followed by this exception (stack trace with root cause):
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:54310/tmp/temp-221133443/tmp1478461116
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:154)
at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:116)
at org.apache.pig.tools.pigstats.OutputStats.iterator(OutputStats.java:148)
at org.apache.jsp.result_jsp._jspService(result_jsp.java:86)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:419)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:391)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:334)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:304)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:164)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:462)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:562)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:395)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:250)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:166)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:302)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
The same code works without an error if I replace the DUMP with a STORE.
Can someone please explain what is going on?
Thanks, Ravi
Upvotes: 1
Views: 759
Reputation: 10650
In the case of dump, Pig stores the output in a temporary location, e.g. hdfs://localhost/tmp/temp797130848/tmp1101984728
(have a look at pig.map.output.dirs in your job's config.xml).
At some point, PigRunner.run() calls GruntParser.processDump(String alias), which iterates through the result tuples and prints them to the console:
Iterator<Tuple> result = mPigServer.openIterator(alias);
while (result.hasNext())
{
    Tuple t = result.next();
    System.out.println(TupleFormat.format(t));
}
After this, but before returning, it also calls FileLocalizer.deleteTempFiles(), which deletes this temporary directory.
Now you want to read the result of alias B yourself.
OutputStats's iterator tries to open the temporary file again to loop over the tuples, just as PigRunner.run() did before.
The problem is that this file no longer exists, so you get the exception.
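So I'd suggest removing the code after System.out.println("Stats : " + stats.getReturnCode());
since the dump has already been printed. With STORE the output goes to the location you name rather than to a temporary one, so it is not cleaned up and the iterator can still open it.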
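The lifecycle described above can be reproduced without Pig or Hadoop. The following is only a minimal sketch of the pattern using local files, not Pig's actual implementation: the class TempOutputDemo and the methods runDumpLike, runStoreLike and reopen are hypothetical stand-ins for the dump path, the store path and OutputStats.iterator() respectively.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class TempOutputDemo {

    // Hypothetical stand-in for DUMP: write to a temporary location,
    // print the rows, then delete the temporary files before returning.
    public static Path runDumpLike(List<String> rows) throws IOException {
        Path dir = Files.createTempDirectory("temp-");
        Path out = dir.resolve("part-00000");
        Files.write(out, rows);
        for (String row : Files.readAllLines(out)) {
            System.out.println("(" + row + ")");
        }
        Files.delete(out);
        Files.delete(dir);
        return out; // the recorded location now points at nothing
    }

    // Hypothetical stand-in for STORE: write to a caller-chosen location
    // that is left in place.
    public static Path runStoreLike(List<String> rows, Path target) throws IOException {
        Files.write(target, rows);
        return target;
    }

    // Hypothetical stand-in for OutputStats.iterator(): re-open the
    // recorded output location and read it again.
    public static List<String> reopen(Path location) throws IOException {
        return Files.readAllLines(location);
    }

    public static void main(String[] args) throws IOException {
        List<String> rows = Arrays.asList("1", "2", "3");

        Path dumped = runDumpLike(rows);
        try {
            reopen(dumped);
        } catch (NoSuchFileException e) {
            // Same failure mode as "Input path does not exist" in the question.
            System.out.println("Re-open after dump failed: " + e.getMessage());
        }

        Path stored = runStoreLike(rows, Files.createTempFile("stored-", ".txt"));
        System.out.println("Re-open after store: " + reopen(stored));
        Files.delete(stored);
    }
}
```

In this sketch the dump-like run prints the rows and then fails on the second read, while the store-like run can be read back, which mirrors the difference you see between DUMP and STORE.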
Upvotes: 3