Reputation: 1715
I was trying to expose Nutch through its REST endpoints and ran into an issue in the indexer phase. I'm using the Elasticsearch index writer to index docs to ES, and I started the server with the $NUTCH_HOME/runtime/deploy/bin/nutch startserver command. During indexing, the following exception is thrown:
Error: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
16/10/07 16:01:47 INFO mapreduce.Job: map 100% reduce 0%
16/10/07 16:01:49 INFO mapreduce.Job: Task Id : attempt_1475748314769_0107_r_000000_1, Status : FAILED
Error: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
16/10/07 16:01:53 INFO mapreduce.Job: Task Id : attempt_1475748314769_0107_r_000000_2, Status : FAILED
Error: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
16/10/07 16:01:58 INFO mapreduce.Job: map 100% reduce 100%
16/10/07 16:01:59 INFO mapreduce.Job: Job job_1475748314769_0107 failed with state FAILED due to: Task failed task_1475748314769_0107_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1
ERROR indexer.IndexingJob: Indexer: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
    at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
    at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
Failed with exit code 255.
Any help would be appreciated.
PS: After debugging with the stack trace, I think the issue is a mismatch in Guava versions. I tried changing the build.xml of the plugins (parse-tika and parsefilter-naivebayes), but it didn't work.
Upvotes: 3
Views: 348
Reputation: 1715
I have found a solution to this issue. It is caused by a version incompatibility in the Guava dependency. Hadoop ships guava-11.0.2.jar, but the Elasticsearch indexer plugin in Nutch requires Guava 18.0. MoreExecutors.directExecutor() was only added in Guava 18.0, so the call fails when the job runs on a distributed Hadoop cluster and the older jar is picked up first. The fix is to update the Guava jar to 18.0 in the Hadoop libs (found at $HADOOP_HOME/share/hadoop/common/lib/).
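To confirm which Guava jar actually wins on a given class path, a small reflection check can help. This is only a rough sketch, not part of Nutch or Hadoop: the class name GuavaCheck and its helper methods are made up for illustration, and it assumes you run it with the same class path the job sees.

```java
import java.security.CodeSource;

public class GuavaCheck {

    /** Returns where a class was loaded from, or a placeholder if the JVM does not report it. */
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "unknown (bootstrap class path)";
        }
        return src.getLocation().toString();
    }

    /** Returns true if the class exposes a public no-argument method with the given name. */
    static boolean hasNoArgMethod(Class<?> cls, String name) {
        try {
            cls.getMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Builds a one-line report: which jar the class came from and whether the method exists. */
    static String describe(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            return className + " loaded from " + locationOf(cls)
                    + "; " + methodName + "() present: " + hasNoArgMethod(cls, methodName);
        } catch (ClassNotFoundException e) {
            return className + " not found on the class path";
        }
    }

    public static void main(String[] args) {
        // directExecutor() is the method named in the error above; it exists only in Guava 18.0+.
        System.out.println(describe(
                "com.google.common.util.concurrent.MoreExecutors", "directExecutor"));
    }
}
```

One way to run it against the Hadoop class path, assuming the hadoop command is on your PATH, is: java -cp "$(hadoop classpath):." GuavaCheck. If the reported jar is guava-11.0.2.jar and the method is reported as absent, that matches the failure described in the question.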
Upvotes: 2