How to use combiner, when the output VALUE of reducer is null?

Question

When I try to use a combiner in my MR job, I get the following exception:

java.lang.NullPointerException
    at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:193)
    at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:1315)
    at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1632)

The reason is that I am using null as the output VALUE in my reducer class. Reducer code:

public static class reducer extends Reducer {
    public void reduce(Text key, Iterable values, Context context) throws IOException, InterruptedException {
        context.write(key, null);
    }
}

When I remove the combiner setting, job.setCombinerClass(reducer.class);, the job succeeds.

How can I use a combiner and still get the same reducer output, i.e. with only the KEY in the output?

Manjunath Ballur · Accepted Answer

Passing null as the value through a combiner is not possible. The problem is the following piece of code in IFile.java:

public void append(K key, V value) throws IOException {
    .....

    if (value.getClass() != valueClass)
        throw new IOException("wrong value class: " + value.getClass()
                              + " is not " + valueClass);

    .....
}

In the append() function, there is a check:

if (value.getClass() != valueClass)

Since you are passing null as the value, a NullPointerException is thrown when append() calls getClass() on it:

value.getClass()

So, even if you declare NullWritable (which is again a class) as the value type and still pass null, you will get the same NullPointerException.

Instead of passing null, pass a non-null placeholder value such as 0 (zero).
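
The behaviour described above can be reproduced without Hadoop. The following is a minimal sketch, not Hadoop code: StandInWriter and AppendCheckDemo are hypothetical names, and the stand-in copies only the value-class check quoted from IFile.java. It assumes nothing else in append() matters for this failure.

```java
import java.io.IOException;

// Hypothetical demo class; none of these names exist in Hadoop.
final class AppendCheckDemo {

    // Simplified stand-in that reproduces only the value-class check
    // quoted from IFile.java above. It does not serialize anything.
    static final class StandInWriter {
        private final Class<?> valueClass;

        StandInWriter(Class<?> valueClass) {
            this.valueClass = valueClass;
        }

        void append(Object key, Object value) throws IOException {
            // A null value fails right here: getClass() is called on null.
            if (value.getClass() != valueClass) {
                throw new IOException("wrong value class: " + value.getClass()
                        + " is not " + valueClass);
            }
        }
    }

    // Returns "ok" if append() accepts the value, otherwise the simple
    // name of the exception it threw.
    static String tryAppend(Class<?> valueClass, Object value) {
        try {
            new StandInWriter(valueClass).append("some-key", value);
            return "ok";
        } catch (NullPointerException | IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAppend(Integer.class, null)); // NullPointerException
        System.out.println(tryAppend(Integer.class, 0));    // ok
        System.out.println(tryAppend(Integer.class, "0"));  // IOException
    }
}
```

Any non-null instance of the declared value class passes the check, which is why a placeholder such as 0 works. In a real job, the singleton returned by NullWritable.get() is also a non-null object of its class, so it may be worth trying as the placeholder when NullWritable is the declared value type; that is not verified here.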
