Kal

Reputation: 161

How to save only non-empty reducers' output in HDFS

In my application, the reducers save all of their part files to HDFS, but I want a reducer to write its part file only when that file would not be 0 bytes. Please let me know how to configure this.

Upvotes: 4

Views: 2147

Answers (2)

Jake Biesinger

Reputation: 5828

If you're using the old API, you can use the NullOutputFormat class. Note that it discards all of the job's output, not only the empty part files:

import org.apache.hadoop.mapred.lib.NullOutputFormat;
conf.setOutputFormat(NullOutputFormat.class);

Upvotes: -1

Katja Mueller

Reputation: 61

It is possible - see the documentation section on "Lazy Output":

http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html#Lazy+Output+Creation

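import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
// Use this in place of job.setOutputFormatClass(TextOutputFormat.class)
LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);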
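
To show why this avoids empty part files, here is a minimal plain-Java sketch of the idea behind lazy output: the file is opened only when the first record is written. It is not Hadoop's implementation; LazyOutputSketch, LazyPartWriter, and the part-r-0000N file names are hypothetical and used only for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class LazyOutputSketch {

    /** Hypothetical stand-in for a reducer's record writer; not a Hadoop class. */
    static class LazyPartWriter implements AutoCloseable {
        private final Path target;
        private BufferedWriter writer; // stays null until the first record arrives

        LazyPartWriter(Path target) {
            this.target = target;
        }

        void write(String key, String value) throws IOException {
            if (writer == null) {
                // The output file is created here, on the first write only.
                writer = Files.newBufferedWriter(target);
            }
            writer.write(key + "\t" + value);
            writer.newLine();
        }

        @Override
        public void close() throws IOException {
            // Nothing to close (and no file on disk) if no record was written.
            if (writer != null) {
                writer.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazy-output");
        Path empty = dir.resolve("part-r-00000");
        Path nonEmpty = dir.resolve("part-r-00001");

        try (LazyPartWriter w = new LazyPartWriter(empty)) {
            // This "reducer" receives no records, so write() is never called.
        }
        try (LazyPartWriter w = new LazyPartWriter(nonEmpty)) {
            w.write("word", "3");
        }

        System.out.println(Files.exists(empty));    // false
        System.out.println(Files.exists(nonEmpty)); // true
    }
}
```

A writer that opens its file eagerly would leave a 0-byte part-r-00000 behind in the first case, which is the behaviour described in the question.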

Upvotes: 6
