user3690321

Reputation: 65

Implementing WritableComparable for Hadoop

I have implemented WritableComparable for my map job and have passed three values to it.

public class KeyCustom implements WritableComparable<KeyCustom>
{
   private Text placeOfBirth;
   private Text country;
   private LongWritable age;
   //Implemented constructors and set methods, write, readFields, hashCode and equals
   @Override
   public int compareTo(KeyCustom arg0)
   {
      return placeOfBirth.compareTo(arg0.placeOfBirth);
   }
}

But when I log these three fields in my reducer, I can clearly see that all the people with the same country are being grouped together. It would be great if someone could help me out so that each reducer gets the people with the same place of birth. I don't know how to do this, or whether my compareTo function is wrong.

Thanks for all the help.

Upvotes: 2

Views: 2886

Answers (2)

user2458922

Reputation: 1721

I would say you have two options:

1) A custom Partitioner, as discussed above, or

2) Override hashCode() as

@Override
public int hashCode() {
    return placeOfBirth.hashCode();
}

Reason

The default partitioner works on the hashCode of the WritableComparable key. Hence, for a custom WritableComparable, you either need to override hashCode(), which lets the partitioner spread the map output across the reducers, or implement and assign your own Partitioner class to the job that considers only the placeOfBirth field for partitioning.
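For the second route, a minimal sketch of such a partitioner might look like the following (the class name PlaceOfBirthPartitioner and the getPlaceOfBirth() getter are assumptions; the value type Text stands in for whatever your mapper actually emits):

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Hypothetical partitioner that routes records by placeOfBirth only,
// so all people born in the same place end up at the same reducer.
public class PlaceOfBirthPartitioner extends Partitioner<KeyCustom, Text> {

    @Override
    public int getPartition(KeyCustom key, Text value, int numPartitions) {
        // Use only the placeOfBirth field; mask the sign bit so the result stays non-negative.
        return (key.getPlaceOfBirth().hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

You would then register it in the driver with job.setPartitionerClass(PlaceOfBirthPartitioner.class).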

Upvotes: 1

Roman Nikitchenko

Reputation: 13036

You're trying to solve your task with the wrong approach. What you really need is to implement a proper partitioner.

By the way, you don't need a special compareTo() implementation to do special partitioning.

UPDATE:

Try just changing the partitioner to TotalOrderPartitioner in your job and your issue will probably be solved. Here is a decent example of what it should look like.
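For reference, wiring TotalOrderPartitioner into the driver would roughly look like this (a sketch only, assuming the new mapreduce API; the partition file path and the sampler parameters are placeholder values, and these calls go after the Job is configured and can throw checked exceptions):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

// TotalOrderPartitioner needs a partition file defining the key ranges;
// InputSampler can generate one from a sample of the job's input.
job.setPartitionerClass(TotalOrderPartitioner.class);
TotalOrderPartitioner.setPartitionFile(job.getConfiguration(),
        new Path("/tmp/partitions.lst"));  // placeholder path
InputSampler.writePartitionFile(job,
        new InputSampler.RandomSampler<KeyCustom, Text>(0.1, 10000, 10));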

Upvotes: 3
