Reputation: 305
My mapper task returns the following output:
2 c
2 g
3 a
3 b
6 r
I have written a reducer and a key comparator that produce correctly sorted output, but how do I emit only the top 3 records (top N by count) of the mapper output?
public static class WLReducer2 extends
        Reducer<IntWritable, Text, Text, IntWritable> {
    @Override
    protected void reduce(IntWritable key, Iterable<Text> values,
            Context context) throws IOException, InterruptedException {
        for (Text x : values) {
            context.write(new Text(x), key);
        }
    }
}
public static class KeyComparator extends WritableComparator {
    protected KeyComparator() {
        super(IntWritable.class, true);
    }

    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
        IntWritable ip1 = (IntWritable) w1;
        IntWritable ip2 = (IntWritable) w2;
        // Negate the natural order so that keys are sorted in descending order.
        return -1 * ip1.compareTo(ip2);
    }
}
This is the reducer output:
r 6
b 3
a 3
g 2
c 2
The expected output from the reducer is the top 3 by count:
r 6
b 3
a 3
Upvotes: 0
Views: 3733
Reputation: 1496
If your top N elements fit in memory and the job can be aggregated by a single reducer, you could use a TreeMap to hold the top N elements. Once the map holds N entries, get the lowest key in the tree with map.firstKey() for each new record. If the current value is bigger than that lowest key, insert it with map.put(value, item) and then delete the lowest entry with map.remove(lowestKey).
Note: the value you compare your records by must be the key of the TreeMap, and the value of the TreeMap should be the description, tag, letter, etc. associated with that number.
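The TreeMap idea above can be sketched in plain Java, without any Hadoop classes. This is only an illustration under some assumptions: the class name TopNSketch and the helper topN are hypothetical, and each count maps to a list of items because TreeMap keys are unique, so a plain TreeMap<Integer, String> would silently overwrite ties such as "a 3" and "b 3". In a real job, the same logic would probably live in the reducer, filling the map in reduce() and writing it out in cleanup().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TopNSketch {
    // Hypothetical helper: returns the n "item count" pairs with the highest counts,
    // ordered from the highest count to the lowest.
    static List<String> topN(String[] items, int[] counts, int n) {
        // Key = count, value = every item seen with that count.
        TreeMap<Integer, List<String>> top = new TreeMap<>();
        int size = 0; // total number of items currently held
        for (int i = 0; i < items.length; i++) {
            top.computeIfAbsent(counts[i], k -> new ArrayList<>()).add(items[i]);
            size++;
            if (size > n) {
                // Evict one item from the bucket with the lowest count.
                Map.Entry<Integer, List<String>> lowest = top.firstEntry();
                List<String> bucket = lowest.getValue();
                bucket.remove(bucket.size() - 1);
                if (bucket.isEmpty()) {
                    top.remove(lowest.getKey());
                }
                size--;
            }
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : top.descendingMap().entrySet()) {
            for (String item : e.getValue()) {
                result.add(item + " " + e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Same records as the mapper output in the question.
        String[] items = {"c", "g", "a", "b", "r"};
        int[] counts = {2, 2, 3, 3, 6};
        for (String line : topN(items, counts, 3)) {
            System.out.println(line);
        }
    }
}
```

The map never holds more than N items, so memory use stays bounded no matter how many records pass through. Items that share a count are kept in arrival order, which may differ from the tie order in the question's expected output.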
Upvotes: 1
Reputation: 2221
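Restrict the output of your reducer to the first three records. Something like this:
public static class WLReducer2 extends
        Reducer<IntWritable, Text, Text, IntWritable> {
    // Records seen so far; the field persists across reduce() calls within one reducer task.
    private int count = 0;

    @Override
    protected void reduce(IntWritable key, Iterable<Text> values,
            Context context) throws IOException, InterruptedException {
        for (Text x : values) {
            // Keys arrive in descending order, so the first 3 records are the top 3.
            if (count < 3) {
                context.write(new Text(x), key);
            }
            count++;
        }
    }
}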
Also set the number of reducers to 1 with job.setNumReduceTasks(1), so that a single reducer sees every key in sorted order.
Upvotes: 3