Reputation: 23
I want to write the reducer result into a normal file (e.g. a .csv or .log file) instead of into HDFS, so I use the following code in my reducer class:
@Override
public void reduce(Text key, Iterable<LongWritable> values, Context context)
        throws IOException, InterruptedException {
    // Count the number of values for this key
    long sum = 0;
    for (LongWritable value : values) {
        sum++;
    }
    context.write(key, new LongWritable(sum));
    System.out.println(key + " : " + sum);
    Main.map.put(key.toString(), sum);
}
I then print the map's content into a .csv file in the Main class. However, after the reducer finishes, the file is empty. I found that the map is empty because nothing is ever put into it in the reducer class, and I also cannot see any of the System.out.println(key + " : " + sum) output from the reducer in the console.
How can that be? Are the key/value pairs not being processed in the reducer class?
Upvotes: 0
Views: 937
Reputation: 10931
Let's get down to the root of the issue here. Each map or reduce task is launched in its own Java Virtual Machine (JVM). These JVMs do not share memory with each other.
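To see why that breaks your approach, here is the pattern from the question boiled down to plain Java. The names mirror the question; this is just an illustration of per-JVM static fields, not Hadoop code:

import java.util.HashMap;
import java.util.Map;

public class Main {
    // Static fields live in the heap of one JVM; every JVM that loads
    // this class gets its own independent copy of `map`.
    public static Map<String, Long> map = new HashMap<>();

    public static void main(String[] args) {
        // The reduce task calls Main.map.put(...) in a *different* process,
        // i.e. on a different copy of this map. The driver's copy here
        // therefore stays empty.
        System.out.println("map size: " + map.size()); // prints: map size: 0
    }
}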
Let's say you have the following setup:

jvm-1 runs your Main class (the driver)
jvm-2 runs your reduce task

This is what happens:

jvm-2 puts each (key, sum) pair into its own copy of Main.map<K,V>
jvm-1 later reads its own copy of Main.map<K,V>

but there's nothing there, because jvm-2 wrote to a map in its own memory that jvm-1 will never see. A similar thing happens with System.out. It may not actually be attached to the stdout stream of your console; if you have a multi-node setup, the output is likely going to another machine on the network.
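If the goal is to end up with a local .csv file, one workable pattern is to let the reducer write to HDFS as usual and have the driver copy the output down once the job succeeds. Below is a minimal sketch along those lines; the class and method names (ExportResults, hdfsToLocalCsv) are hypothetical, and it assumes the default TextOutputFormat, i.e. tab-separated key/value lines in part-r-* files under the job's output directory:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ExportResults {
    // Copy the reducer output (part-r-* files) from HDFS into one local
    // CSV file. This runs in the driver's JVM, after the job has finished.
    public static void hdfsToLocalCsv(Configuration conf, Path outputDir,
                                      String localCsv) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        try (PrintWriter csv = new PrintWriter(localCsv)) {
            for (FileStatus status : fs.globStatus(new Path(outputDir, "part-r-*"))) {
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(fs.open(status.getPath())))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        // TextOutputFormat separates key and value with a tab
                        csv.println(line.replace('\t', ','));
                    }
                }
            }
        }
    }
}

You would call this from the driver only after job.waitForCompletion(true) returns true. If you don't need it in code, hadoop fs -getmerge <outputDir> result.txt does much the same from the command line.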
Upvotes: 1