Reputation: 1332
I have a method that takes an iterator over a collection as its argument. Inside the method I want to copy the collection the iterator is "pointing to". However, the copy contains only the last entry of the original collection, repeated N times, where N is the size of the original collection.
public void someMethod(Iterator<Node> values) {
    Vector<Node> centralNodeNeighbourhood = new Vector<Node>();
    while (values.hasNext()) {
        Node tmp = values.next();
        centralNodeNeighbourhood.add(tmp);
    }
    ...
    //store the centralNodeNeighbourhood on disk
}
Example "original collection":
1
2
3
Example "centralNodeNeighbourhood collection":
3
3
3
Can someone point out my mistake? I cannot change the method's arguments; all I get is the Iterator to the collection, and I can't do anything about that.
UPDATE (Answer to some questions)
while (values.hasNext()) {
    Node tmp = values.next();
    System.out.print("Adding = "+tmp.toString());
    centralNodeNeighbourhood.add(tmp);
}
This prints the original collection's elements correctly. I don't know the type of the original collection, but the Iterator is the standard java.util.Iterator. The method in question is reduce, from the OLD Hadoop API (Hadoop version 0.20.203.0):
public class GatherNodeNeighboursInfoReducer extends MapReduceBase
        implements Reducer<IntWritable, Node, NullWritable, NodeNeighbourhood> {
    public void reduce(IntWritable key, Iterator<Node> values,
            OutputCollector<NullWritable, NodeNeighbourhood> output, Reporter reporter) throws IOException {...}
}
SOLVED: I now make a copy of the tmp object on each iteration and add that copy to the centralNodeNeighbourhood collection. This solved my problem. Thanks for all your (fast) help.
Upvotes: 1
Views: 865
Reputation: 110056
The documentation for Hadoop's reduce method states that the framework reuses the value objects returned by its iterator. That's a terrible thing to do, but that's what it does. Quoting the docs:
The framework will reuse the key and value objects that are passed into the reduce, therefore the application should clone the objects they want to keep a copy of. In many cases, all values are combined into zero or one value.
Upvotes: 1
Reputation: 533520
It appears that the Iterator is returning the same Node object each time. If so, you need to take a copy of the Node before adding it to the collection. (Otherwise you will be adding the same object each time, and every entry will show the last values that object was set to.)
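To make this concrete, here is a minimal sketch in plain Java, without Hadoop. The Node class, its copy constructor, and the reusingIterator helper are all hypothetical stand-ins: the helper just imitates an iterator that hands back one shared, mutated instance. How you copy the real Node depends on how that class is defined.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReusedValueDemo {

    // Hypothetical stand-in for the question's Node: a single mutable field.
    static class Node {
        int id;

        Node(int id) { this.id = id; }

        // Copy constructor (an assumption; the real Node may need another copy mechanism).
        Node(Node other) { this.id = other.id; }

        @Override
        public String toString() { return Integer.toString(id); }
    }

    // Imitates an iterator that reuses one object: next() mutates and returns the same instance.
    static Iterator<Node> reusingIterator(final int... ids) {
        return new Iterator<Node>() {
            private final Node shared = new Node(0);
            private int pos = 0;

            @Override
            public boolean hasNext() { return pos < ids.length; }

            @Override
            public Node next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.id = ids[pos++];
                return shared;
            }
        };
    }

    // The approach from the question: stores N references to the same object.
    static List<Node> collectReferences(Iterator<Node> values) {
        List<Node> out = new ArrayList<>();
        while (values.hasNext()) {
            out.add(values.next());
        }
        return out;
    }

    // The fix: snapshot each value before the iterator overwrites it.
    static List<Node> collectCopies(Iterator<Node> values) {
        List<Node> out = new ArrayList<>();
        while (values.hasNext()) {
            out.add(new Node(values.next()));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectReferences(reusingIterator(1, 2, 3))); // prints [3, 3, 3]
        System.out.println(collectCopies(reusingIterator(1, 2, 3)));     // prints [1, 2, 3]
    }
}
```

This also explains why the print statement inside the loop looked fine: at the moment of printing, the shared object still holds the current value, and it is only overwritten by the following call to next().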
Upvotes: 3