user1819076

Strange HashMap results - Java,Hadoop

I wrote this simple code, which should be equivalent to the default run method of the Reducer class, but it behaves very strangely.

Here is the default run method:

public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKey()) {
        reduce(context.getCurrentKey(), context.getValues(), context);
    }
    cleanup(context);
}

output:

New reducer: 0
Reducer: 0:9,2:5
end of this reducer

Reducer: 0:9,5:7
end of this reducer

... (lots of keys)

Reducer: 7:7,6:7
end of this reducer

Reducer: 7:7,7:6
end of this reducer

and here is my overridden method:

@Override
    public void run(Context context) throws IOException, InterruptedException {
        setup(context);

        HashMap<Text,HashSet<Text>> map = new HashMap<Text,HashSet<Text>>();

        while (context.nextKey()) {
            //reduce(context.getCurrentKey(),context.getValues(),context);
            Text key = context.getCurrentKey();
            map.put(key, new HashSet<Text>());
            for(Text v : context.getValues()){
                map.get(key).add(v);
            }
        }

        for(Text k : map.keySet()){
            reduce(k,map.get(k),context);
        }
        cleanup(context);
    }

output:

New reducer: 0

Reducer: 7:7,7:6
end of this reducer

... (lots of keys)

Reducer: 7:7,7:6
end of this reducer

My problem is that if I copy the keys and values into the HashMap first, nothing works properly: the reduce calls at the end receive the same key (the first one stored in the HashMap) again and again. Can anyone help? How can I make this work properly? I need to do this because I want to pre-process the keys before sending them to the reducers. Thanks in advance!

Upvotes: 0

Views: 741

Answers (1)

Thomas Jungblut

Reputation: 20969

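Hadoop reuses the Writable objects: context.getCurrentKey() and the values iterator hand back the same instances on every iteration and only overwrite their contents. Every reference you store therefore points to the same object, which is why all of your reduce calls see the same key. So you need to create new objects before putting them into your collection.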

Changing your code to copy things would look like this:

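while (context.nextKey()) {
    // Copy the key: the instance returned by getCurrentKey() is reused.
    Text key = new Text(context.getCurrentKey());
    map.put(key, new HashSet<Text>());
    for (Text v : context.getValues()) {
        // Copy each value as well: the iterator reuses one instance too.
        map.get(key).add(new Text(v));
    }
}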
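
To see the effect in isolation, here is a minimal sketch that runs without Hadoop. MutableText is a hypothetical stand-in for a mutable Writable such as Text, not a Hadoop class, and the method names are made up for illustration. It only mimics the reuse pattern: one instance is overwritten on each iteration, and we either store the instance itself or a copy of it.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {

    /** Hypothetical stand-in for a mutable Writable such as Text. */
    static final class MutableText {
        private String value = "";

        MutableText() {}

        /** Copy constructor, playing the role of new Text(other). */
        MutableText(MutableText other) {
            this.value = other.value;
        }

        void set(String v) {
            this.value = v;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    /** Stores the reused instance itself, so every element aliases one object. */
    static List<String> collectWithoutCopy(String[] inputs) {
        MutableText reused = new MutableText();
        List<MutableText> stored = new ArrayList<>();
        for (String in : inputs) {
            reused.set(in);      // the framework overwrites the same object
            stored.add(reused);  // only a reference is stored
        }
        return render(stored);
    }

    /** Stores a fresh copy on each iteration, so every element keeps its own content. */
    static List<String> collectWithCopy(String[] inputs) {
        MutableText reused = new MutableText();
        List<MutableText> stored = new ArrayList<>();
        for (String in : inputs) {
            reused.set(in);
            stored.add(new MutableText(reused));
        }
        return render(stored);
    }

    /** Converts the stored objects to strings once collection has finished. */
    static List<String> render(List<MutableText> items) {
        List<String> out = new ArrayList<>();
        for (MutableText t : items) {
            out.add(t.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] keys = {"0:9,2:5", "0:9,5:7", "7:7,7:6"};
        System.out.println(collectWithoutCopy(keys)); // [7:7,7:6, 7:7,7:6, 7:7,7:6]
        System.out.println(collectWithCopy(keys));    // [0:9,2:5, 0:9,5:7, 7:7,7:6]
    }
}
```

Without the copy, every stored element shows whatever was written last, which matches the repeated 7:7,7:6 in your output.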

Upvotes: 1
