Harinder

Reputation: 11944

Error running Hadoop (Cloudera 2.0.0-cdh4.4.0) job from Eclipse?

Hi, I am running the Hadoop WordCount example from Eclipse and I am getting the following error:

13/11/24 22:17:08 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder sending #12
13/11/24 22:17:08 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder got value #12
13/11/24 22:17:08 DEBUG ipc.ProtobufRpcEngine: Call: delete took 11ms
13/11/24 22:17:08 WARN mapred.LocalJobRunner: job_local1690217234_0001
java.lang.AbstractMethodError
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:96)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:76)
    at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:91)
    at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:483)
    at org.hadoop.par.WordCount$Reduce.reduce(WordCount.java:34)
    at org.hadoop.par.WordCount$Reduce.reduce(WordCount.java:1)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
13/11/24 22:17:09 INFO mapred.JobClient:  map 100% reduce 0%
13/11/24 22:17:09 INFO mapred.JobClient: Job complete: job_local1690217234_0001
13/11/24 22:17:09 INFO mapred.JobClient: Counters: 26
13/11/24 22:17:09 INFO mapred.JobClient:   File System Counters
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of bytes read=172
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of bytes written=91974
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of write operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of bytes read=91
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of bytes written=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of read operations=5
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of write operations=1
13/11/24 22:17:09 INFO mapred.JobClient:   Map-Reduce Framework
13/11/24 22:17:09 INFO mapred.JobClient:     Map input records=15
13/11/24 22:17:09 INFO mapred.JobClient:     Map output records=17
13/11/24 22:17:09 INFO mapred.JobClient:     Map output bytes=152
13/11/24 22:17:09 INFO mapred.JobClient:     Input split bytes=112
13/11/24 22:17:09 INFO mapred.JobClient:     Combine input records=17
13/11/24 22:17:09 INFO mapred.JobClient:     Combine output records=13
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce input groups=1
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce input records=1
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce output records=0
13/11/24 22:17:09 INFO mapred.JobClient:     Spilled Records=13
13/11/24 22:17:09 INFO mapred.JobClient:     CPU time spent (ms)=0
13/11/24 22:17:09 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
13/11/24 22:17:09 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
13/11/24 22:17:09 INFO mapred.JobClient:     Total committed heap usage (bytes)=138477568
13/11/24 22:17:09 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/11/24 22:17:09 INFO mapred.JobClient:     BYTES_READ=91
13/11/24 22:17:09 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1372)
    at org.hadoop.par.WordCount.main(WordCount.java:62)
13/11/24 22:17:09 DEBUG hdfs.DFSClient: Waiting for ack for: -1
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder sending #13
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder got value #13
13/11/24 22:17:09 ERROR hdfs.DFSClient: Failed to close file /user/harinder/test_output/_temporary/_attempt_local1690217234_0001_r_000000_0/part-00000
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/harinder/test_output/_temporary/_attempt_local1690217234_0001_r_000000_0/part-00000: File does not exist. Holder DFSClient_NONMAPREDUCE_-559950586_1 does not have any open files.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2445)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2437)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2503)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2480)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:556)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:337)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44958)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

    at org.apache.hadoop.ipc.Client.call(Client.java:1237)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at com.sun.proxy.$Proxy9.complete(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy9.complete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:329)
    at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1769)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1756)
    at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:696)
    at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:713)
    at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:559)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2399)
    at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2415)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
13/11/24 22:17:09 DEBUG ipc.Client: Stopping client
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder: closed
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder: stopped, remaining connections 0

I have added all the required JARs to my project. Following is the code I am running:

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

  // Mapper: emits (word, 1) for every whitespace-separated token in the input.
  public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        output.collect(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCount.class);

    // Load the cluster configuration so the job talks to HDFS instead of the local filesystem.
    conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path("/user/harinder/test_data/"));
    FileOutputFormat.setOutputPath(conf, new Path("/user/harinder/test_output"));
    //FileInputFormat.setInputPaths(conf, new Path(args[0]));
    //FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}

Following are the JAR files added to the project:

[screenshots of the Eclipse build path listing the added JARs]

Upvotes: 0

Views: 1123

Answers (2)

user3259926

Reputation: 70

Create a JAR of your MapReduce program and put it inside your project, then run it. This JAR is what gets submitted to the cluster.
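For the old-API driver in the question, that could look like the sketch below. JobConf.setJar is the old-API way to point Hadoop at the archive to submit; the wordcount.jar path is only a placeholder for wherever Eclipse exports the JAR:

    JobConf conf = new JobConf(WordCount.class);
    // Hypothetical path - point this at wherever the exported JAR actually lives.
    conf.setJar("wordcount.jar");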

Upvotes: 1

user3020494

Reputation: 732

You are using the old API (JobConf); please use the new API (Job): http://blog.cloudera.com/blog/2012/12/how-to-run-a-mapreduce-job-in-cdh4/

Also, you would need to compile your program into a JAR; otherwise, where you call job.setJarByClass(YourMapReduce.class), the program will search for a JAR file to distribute that does not exist.
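For reference, here is a minimal sketch of WordCount on the new org.apache.hadoop.mapreduce API; it follows the stock example, and the command-line argument handling is just illustrative:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every whitespace-separated token.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount"); // Job.getInstance(conf, "wordcount") on newer releases

    // setJarByClass tells Hadoop which JAR to ship to the cluster, which is
    // why the program must be packaged as a JAR rather than run as loose classes.
    job.setJarByClass(WordCount.class);

    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}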

Upvotes: 1
