user3690321

Reputation: 65

Implementing WritableComparable for Hadoop

I have implemented WritableComparable for my map job and have passed three values to it.

public class KeyCustom implements WritableComparable<KeyCustom>
{
   private Text placeOfBirth;
   private Text country;
   private LongWritable age;
   //Implemented constructors and set methods, write, readFields, hashCode and equals
   @Override
   public int compareTo(KeyCustom arg0)
   {
      return placeOfBirth.compareTo(arg0.placeOfBirth);
   }
}

But when I log these three fields in my reducer, I can clearly see that all the people with the same country are being grouped together. It would be great if someone could help me out so that each reducer gets the people with the same place of birth. I don't know how to do this, or whether my compareTo function is wrong.

Thanks for all the help.

Upvotes: 2

Views: 2886

Answers (2)

user2458922

Reputation: 1721

I would say you have two options:

1) A custom Partitioner, as discussed above, or

2) Override hashCode() as

@Override
public int hashCode() {
    return placeOfBirth.hashCode();
}

Reason

The default partitioner works on the hashCode of the WritableComparable key. Hence, for a custom WritableComparable, you either need to override hashCode(), which lets the partitioner spread the map output across the reducers, or implement and assign your own Partitioner class to the job that considers only the placeOfBirth field for partitioning.
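For the second route, a minimal sketch of such a partitioner might look like the following (the class name PlaceOfBirthPartitioner and the getPlaceOfBirth() getter are assumptions; the value type Text stands in for whatever your mapper actually emits):

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Hypothetical partitioner that routes records by placeOfBirth only,
// so all people born in the same place end up at the same reducer.
public class PlaceOfBirthPartitioner extends Partitioner<KeyCustom, Text> {

    @Override
    public int getPartition(KeyCustom key, Text value, int numPartitions) {
        // Use only the placeOfBirth field; mask the sign bit so the result stays non-negative.
        return (key.getPlaceOfBirth().hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

You would then register it in the driver with job.setPartitionerClass(PlaceOfBirthPartitioner.class).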

Upvotes: 1

Roman Nikitchenko

Reputation: 13036

You're trying to solve your task with the wrong approach. What you really need is to implement a proper partitioner.

By the way, you don't need a special compareTo() implementation to do special partitioning.

UPDATE:

Try just changing the partitioner to TotalOrderPartitioner in your job and your issue will probably be solved. Here is a decent example of what it should look like.
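For reference, wiring TotalOrderPartitioner into the driver would roughly look like this (a sketch only, assuming the new mapreduce API; the partition file path and the sampler parameters are placeholder values, and these calls go after the Job is configured and can throw checked exceptions):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

// TotalOrderPartitioner needs a partition file defining the key ranges;
// InputSampler can generate one from a sample of the job's input.
job.setPartitionerClass(TotalOrderPartitioner.class);
TotalOrderPartitioner.setPartitionFile(job.getConfiguration(),
        new Path("/tmp/partitions.lst"));  // placeholder path
InputSampler.writePartitionFile(job,
        new InputSampler.RandomSampler<KeyCustom, Text>(0.1, 10000, 10));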

Upvotes: 3
