priyanka

Reputation: 325

Pig Error: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected

I have just upgraded Pig from 0.12.0 to 0.13.0 on Hortonworks HDP 2.1.

I am getting the error below when trying to use XMLLoader in my script, even though I have already registered piggybank.

Script:

 A = load 'EPAXMLDownload.xml' using org.apache.pig.piggybank.storage.XMLLoader('Document') as (x:chararray);

Error:

dump A
2014-08-10 23:08:56,494 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-08-10 23:08:56,496 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-08-10 23:08:56,651 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-08-10 23:08:56,727 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
2014-08-10 23:08:57,191 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-08-10 23:08:57,199 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-08-10 23:08:57,214 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-08-10 23:08:57,223 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-08-10 23:08:57,247 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected

Upvotes: 2

Views: 3261

Answers (4)

dinesh028

Reputation: 2187

Sometimes you may get a problem like the one below after installing Pig:

java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
 at org.apache.hcatalog.common.HCatUtil.checkJobContextIfRunningFromBackend(HCatUtil.java:88)
 at org.apache.hcatalog.pig.HCatLoader.setLocation(HCatLoader.java:162)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:540)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:322)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:199)
 at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:277)
 at org.apache.pig.PigServer.launchPlan(PigServer.java:1367)
 at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1352)
 at org.apache.pig.PigServer.execute(PigServer.java:1341)

Many blogs suggest recompiling Pig by executing the command:

ant clean jar-all -Dhadoopversion=23

or recompiling piggybank.jar with the following steps:

cd contrib/piggybank/java

ant clean

ant -Dhadoopversion=23

But this may not solve your problem. The actual cause here is related to HCatalog; try updating it. In my case, I was using Hive 0.13 and Pig 0.13, with the HCatalog that ships with Hive 0.13.

I then updated Pig to 0.15 and used the separate hive-hcatalog-0.13.0.2.1.1.0-385 library jars, and the problem was resolved; a sketch of the registration is below.

Later I identified that it was not Pig causing the problem but the Hive-HCatalog libraries. Hope this helps.
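
For illustration, here is a minimal sketch of what registering the standalone hive-hcatalog jar can look like in a Pig script; the jar path and table name are assumptions for your layout. Note that in the hive-hcatalog packaging the loader class moved from org.apache.hcatalog.pig.HCatLoader to org.apache.hive.hcatalog.pig.HCatLoader:

    -- jar path is an assumption; adjust to where your hive-hcatalog jars live
    REGISTER /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.1.0-385.jar;

    -- in the hive-hcatalog packaging the loader lives under org.apache.hive.hcatalog
    A = LOAD 'my_db.my_table' USING org.apache.hive.hcatalog.pig.HCatLoader();
    DUMP A;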

Upvotes: 1

WattsInABox

Reputation: 4636

A few more details because the other answers didn't work for me:

  1. Git clone the pig git mirror https://github.com/apache/pig
  2. cd into the cloned directory
  3. if you've already built pig in this directory, run a clean first

    ant clean
    
  4. build pig for hadoop 2

    ant -Dhadoopversion=23
    
  5. cd into piggybank

    cd contrib/piggybank/java
    
  6. again, if you've built piggybank before, make sure to clean out the old build files

    ant clean
    
  7. build piggybank for hadoop 2 (same command, different directory)

    ant -Dhadoopversion=23
    

If you don't build pig first, piggybank will throw a bunch of "symbol not found" errors while compiling. In addition, since I had previously (and accidentally) built pig for Hadoop 1 without running a clean, I ran into runtime errors.
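
As a final check, you can point the original script at the rebuilt jar. This is only a sketch: the checkout path is an assumption, and the build drops piggybank.jar in contrib/piggybank/java:

    -- /path/to/pig is an assumption; use your own checkout location
    REGISTER /path/to/pig/contrib/piggybank/java/piggybank.jar;

    A = LOAD 'EPAXMLDownload.xml'
        USING org.apache.pig.piggybank.storage.XMLLoader('Document')
        AS (x:chararray);
    DUMP A;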

Upvotes: 3

holdfenytolvaj

Reputation: 6097

Note that pig decides the hadoop version depending on which environment variable you have set: HADOOP_HOME -> v1, HADOOP_PREFIX -> v2.

If you use hadoop2, you need to recompile piggybank (which is compiled for hadoop1 by default):

  1. go to pig/contrib/piggybank/java
  2. $ ant -Dhadoopversion=23
  3. then copy that jar over pig/lib/piggybank.jar

Upvotes: 5

Abi

Reputation: 83

I faced the same error with Hadoop version 2.2.0. The workaround is to register the following jar files from the grunt shell.

The paths below are for hadoop-2.2.0; locate the corresponding jars for your version.

/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar

/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar

Using the REGISTER command, register these jars along with piggybank, as in the sketch below.
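
For example, in grunt the registrations could look like this (the piggybank path is an assumption):

    REGISTER /hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar;
    REGISTER /hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar;
    REGISTER /path/to/piggybank.jar; -- piggybank location is an assumption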

Run the Pig script/command now, and report back if you face any issues.

Upvotes: 0
