HADOOP - Word Count Example for 1.2.1 Stable

Question

I am working through a word count example for Hadoop 1.2.1. But something must have changed, because I can't seem to get it to work.

Here is my Reduce class:

public static class Reduce extends Reducer {

    public void reduce(WritableComparable key,
                       Iterator values,
                       OutputCollector output,
                       Reporter reporter) throws IOException {

        output.collect(key, NullWritable.get());

    }

}

And my main function:

public static void main(String[] args) throws Exception {

    JobConf jobConf = new JobConf(MapDemo.class);

    jobConf.setNumMapTasks(10);
    jobConf.setNumReduceTasks(1);

    jobConf.setJobName("MapDemo");

    jobConf.setOutputKeyClass(Text.class);
    jobConf.setOutputValueClass(NullWritable.class);

    jobConf.setMapperClass(Map.class);
    jobConf.setReducerClass(Reduce.class);

    jobConf.setInputFormat(TextInputFormat.class);
    jobConf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
    FileOutputFormat.setOutputPath(jobConf, new Path(args[1]));

    JobClient.runJob(jobConf);
}

My IDE is telling me there is an error, corroborated by Maven:

[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] com/example/mapreduce/MapDemo.java:[71,16] method setReducerClass in class org.apache.hadoop.mapred.JobConf cannot be applied to given types;
required: java.lang.Class
found: java.lang.Class
reason: actual argument java.lang.Class cannot be converted to java.lang.Class by method invocation conversion
[INFO] 1 error
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.679s
[INFO] Finished at: Mon Sep 16 09:23:08 PDT 2013
[INFO] Final Memory: 17M/202M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project inventory: Compilation failure
[ERROR] com/example/mapreduce/MapDemo.java:[71,16] method setReducerClass in class org.apache.hadoop.mapred.JobConf cannot be applied to given types;
[ERROR] required: java.lang.Class
[ERROR] found: java.lang.Class

I believe the word count examples online are out of date for 1.2.1. How do I fix this? Does anyone have a link to working word count Java source for 1.2.1?

Tariq · Accepted Answer

Which link did you follow? I have never seen this kind of word count. Whatever you followed is definitely outdated, since it uses the old API, and I doubt you followed it properly.
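The compiler message in the question reads as a contradiction (`java.lang.Class` required, `java.lang.Class` found), most likely because the generic parameters were lost when it was pasted. `JobConf.setReducerClass` takes a `Class<? extends org.apache.hadoop.mapred.Reducer>`, and in the old API that `Reducer` is an interface. The question's `Reduce extends Reducer` suggests it picked up the new-API `org.apache.hadoop.mapreduce.Reducer` class instead, which is an unrelated type. Here is a rough sketch of the same situation without Hadoop on the classpath. `OldApi`, `NewApi`, and `FakeJobConf` are hypothetical stand-ins, not Hadoop classes:

```java
public class ApiMismatchSketch {

    // Stand-in for the old API (org.apache.hadoop.mapred), where Reducer is an interface.
    static class OldApi {
        interface Reducer {}
    }

    // Stand-in for the new API (org.apache.hadoop.mapreduce), where Reducer is
    // an unrelated class that happens to share the same simple name.
    static class NewApi {
        static class Reducer {}
    }

    // Stand-in for JobConf: it only accepts classes that implement the old-API type.
    static class FakeJobConf {
        private Class<? extends OldApi.Reducer> reducerClass;

        void setReducerClass(Class<? extends OldApi.Reducer> theClass) {
            this.reducerClass = theClass;
        }

        Class<? extends OldApi.Reducer> getReducerClass() {
            return reducerClass;
        }
    }

    static class OldStyleReduce implements OldApi.Reducer {}

    static class NewStyleReduce extends NewApi.Reducer {}

    public static void main(String[] args) {
        FakeJobConf conf = new FakeJobConf();
        conf.setReducerClass(OldStyleReduce.class);      // compiles
        // conf.setReducerClass(NewStyleReduce.class);   // rejected by the compiler: incompatible types
        System.out.println(conf.getReducerClass().getSimpleName());
        // The two Reducer types are unrelated, so this prints false.
        System.out.println(OldApi.Reducer.class.isAssignableFrom(NewStyleReduce.class));
    }
}
```

Either stay entirely within `org.apache.hadoop.mapred` (`JobConf`, `JobClient`, the `Reducer` interface) or entirely within `org.apache.hadoop.mapreduce` (`Job`, the `Reducer` class); mixing the two is what produces this kind of error.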

This should work:

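import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {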
    /**
     * The map class of WordCount.
     */
    public static class TokenCounterMapper extends
            Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emits (token, 1) for every whitespace-separated token in the line.
        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {

            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    /**
     * The reducer class of WordCount
     */
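    public static class TokenCounterReducer extends
            Reducer<Text, IntWritable, Text, IntWritable> {
        // Sums the counts collected for one word and emits (word, total).
        @Override
        public void reduce(Text key, Iterable<IntWritable> values,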
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    /**
     * The main entry point.
     */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/core-site.xml"));
        conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/hdfs-site.xml"));
        conf.set("fs.default.name", "hdfs://localhost:9000");
        conf.set("mapred.job.tracker", "localhost:9001");
        Job job = new Job(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenCounterMapper.class);
        job.setReducerClass(TokenCounterReducer.class);
        job.setNumReduceTasks(2);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/inputs/demo.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/outputs/1111223"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

A few observations:

Your Reducer emits NullWritable as the value, so it writes only the key, without any count.
Use proper types for your input and output keys/values, declared as type parameters on the Mapper and Reducer rather than left as raw types.
Use the new API. It is cleaner and better.
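
To see what the first observation means in practice, the map and reduce steps can be mimicked in plain Java with no Hadoop involved. This is only a rough sketch: `LocalWordCountSketch`, `map`, and `reduce` are made-up names, and a `TreeMap` stands in for the shuffle that groups values by key. The map step emits a 1 per token, and the reduce step sums the values for each word:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCountSketch {

    // Map step: record a 1 for every token. Grouping by word in a sorted map
    // is a stand-in for what the framework's shuffle/sort phase would do.
    static Map<String, List<Integer>> map(String line) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            grouped.computeIfAbsent(itr.nextToken(), k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Reduce step: sum the values collected for one key.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = map("to be or not to be");
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            // Key and count separated by a tab.
            System.out.println(entry.getKey() + "\t" + reduce(entry.getValue()));
        }
    }
}
```

For the input "to be or not to be" this prints be 2, not 1, or 1, to 2, one pair per line. A reducer that wrote NullWritable in place of the sum would leave only the four words, which is what the Reduce class in the question does.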
