Reputation: 276
I'm testing a simple MapReduce application, but I'm stuck trying to understand what happens when I iterate over the input values of a reduce call.
This is the piece of code that behaves strangely:
public void reduce(Text key, Iterable<E> values, Context context)
        throws IOException, InterruptedException {
    Iterator<E> statesIter = values.iterator();
    E first = statesIter.next();
    while (statesIter.hasNext()) {
        E state = statesIter.next();
        System.out.println(first.toString());
        // some other stuff
    }
    // some other stuff
}
Nothing strange here, except that each println invocation actually prints a different string. Every time I call next(), the object referenced by first changes.
Why does this happen?
Upvotes: 1
Views: 1205
Reputation: 1
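It's somewhat counter-intuitive, but it is documented in the API docs: Hadoop reuses the key and value objects while you iterate, so each call to next() refills the same instance instead of returning a new one. That is why first appears to change. Clone the objects if you want to keep them around.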
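To make the effect concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It only simulates the reuse: `State`, `reusingIterator` and `seenThroughFirst` are hypothetical names invented for this illustration, and the iterator is an assumption about the general pattern, not Hadoop's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical mutable value type, standing in for a Hadoop Writable. */
    static final class State {
        private String data = "";
        State() {}
        State(State other) { this.data = other.data; } // copy constructor
        void set(String data) { this.data = data; }
        @Override public String toString() { return data; }
    }

    /** Returns an iterator that hands out ONE shared State, refilled on every next(). */
    static Iterator<State> reusingIterator(List<String> raw) {
        final State shared = new State();
        final Iterator<String> source = raw.iterator();
        return new Iterator<State>() {
            @Override public boolean hasNext() { return source.hasNext(); }
            @Override public State next() {
                shared.set(source.next()); // overwrite in place, no new object
                return shared;
            }
        };
    }

    /** Mirrors the loop from the question: records what 'first' shows after each next(). */
    static List<String> seenThroughFirst(List<String> raw, boolean copyFirst) {
        Iterator<State> it = reusingIterator(raw);
        State first = it.next();
        if (copyFirst) {
            first = new State(first); // keep a private copy instead of the shared reference
        }
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            it.next();
            seen.add(first.toString());
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        System.out.println(seenThroughFirst(raw, false)); // prints [b, c]
        System.out.println(seenThroughFirst(raw, true));  // prints [a, a]
    }
}
```

In a real reducer the copy step depends on the value type. For `Text`, the copy constructor `new Text(value)` should do; for a general `Writable`, something like `WritableUtils.clone(value, context.getConfiguration())` is one option, assuming your type round-trips correctly through its own serialization.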
Upvotes: 4