Reputation: 325
I have just upgraded Pig from 0.12.0 to 0.13.0 on Hortonworks HDP 2.1.
I am getting the error below when I try to use XMLLoader in my script, even though I have already registered piggybank.
Script:
A = load 'EPAXMLDownload.xml' using org.apache.pig.piggybank.storage.XMLLoader('Document') as (x:chararray);
Error:
dump A
2014-08-10 23:08:56,494 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-08-10 23:08:56,496 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-08-10 23:08:56,651 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-08-10 23:08:56,727 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
2014-08-10 23:08:57,191 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-08-10 23:08:57,199 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-08-10 23:08:57,214 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-08-10 23:08:57,223 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-08-10 23:08:57,247 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
Upvotes: 2
Views: 3261
Reputation: 2187
Sometimes you may get a problem like the one below after installing Pig:
java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.hcatalog.common.HCatUtil.checkJobContextIfRunningFromBackend(HCatUtil.java:88)
at org.apache.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:162)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:540)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:322)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:199)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:277)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1367)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1352)
at org.apache.pig.PigServer.execute(PigServer.java:1341)
Many blogs suggest recompiling Pig by executing the command:
ant clean jar-all -Dhadoopversion=23
or recompiling piggybank.jar with the following steps:
cd contrib/piggybank/java
ant clean
ant -Dhadoopversion=23
But this may not solve your problem. The actual cause here is related to HCatalog, so try updating it. In my case, I was using Hive 0.13 and Pig 0.13, along with the HCatalog shipped with Hive 0.13.
I then updated Pig to 0.15 and used the separate hive-hcatalog-0.13.0.2.1.1.0-385 library jars, and the problem was resolved.
I later identified that it was not Pig causing the problem but rather the Hive-HCatalog libraries. Hope this helps.
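A rough sketch of one way to point Pig at the newer HCatalog jars through the launcher's PIG_CLASSPATH variable (the /usr/lib/hive-hcatalog path and myscript.pig are assumptions for illustration; substitute wherever your hive-hcatalog jars actually live):
export HCAT_HOME=/usr/lib/hive-hcatalog
export PIG_CLASSPATH=$HCAT_HOME/share/hcatalog/*:$PIG_CLASSPATH
pig myscript.pig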
Upvotes: 1
Reputation: 4636
A few more details because the other answers didn't work for me:
if you've built pig in this directory before, you should run a clean first
ant clean
build pig for hadoop 2
ant -Dhadoopversion=23
cd into piggybank
cd contrib/piggybank/java
again, if you've built piggybank before, make sure to clean out the old build files
ant clean
build piggybank for hadoop 2 (same command, different directory)
ant -Dhadoopversion=23
If you don't build pig first, piggybank will throw a bunch of "symbol not found" exceptions while compiling. In addition, because I had previously (and accidentally) built pig for Hadoop 1 and didn't run a clean, I ran into runtime errors.
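Putting the steps above together as one sequence (assuming you start at the top of the pig source tree):
ant clean
ant -Dhadoopversion=23
cd contrib/piggybank/java
ant clean
ant -Dhadoopversion=23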
Upvotes: 3
Reputation: 6097
Note that Pig decides the Hadoop version depending on which environment variable you have set: HADOOP_HOME -> v1, HADOOP_PREFIX -> v2.
If you use Hadoop 2, you need to recompile piggybank (which is compiled for Hadoop 1 by default).
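As a sketch, make sure only the Hadoop 2 variable is set before invoking Pig (the /usr/lib/hadoop path is an assumption for a typical install; adjust to your layout):
unset HADOOP_HOME
export HADOOP_PREFIX=/usr/lib/hadoop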
Upvotes: 5
Reputation: 83
I faced the same error with Hadoop version 2.2.0. The workaround is to register the following jar files from the grunt shell.
The paths below are for hadoop-2.2.0; find the corresponding jars for your version.
/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar
/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar
Register these jars along with piggybank using the REGISTER command.
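For example, from the grunt shell (the piggybank.jar location below is only an assumption for illustration; use the path where yours is installed):
REGISTER /hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar;
REGISTER /hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar;
REGISTER /usr/lib/pig/piggybank.jar;
A = load 'EPAXMLDownload.xml' using org.apache.pig.piggybank.storage.XMLLoader('Document') as (x:chararray);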
Run the pig script/command now and report back if you face any issues.
Upvotes: 0