user5532529

Reputation: 11

MapReduce: output all values of one key together on one line

I am working on MapReduce with the WordCount example. Input data:

text files

Output:

term: fileName  occurrences

Map output :

Term:filename 1 1 1 1 1

Reduce output:

Term: filename occurrences

Example of the code's current final output (reducer output):

Iphone: file1 4
Iphone: file2 3
Galaxy: file1 2
Htc: file1 3
Htc: file2 5

Output I want:

Iphone: file1=4 file2=3
Galaxy: file1=2
Htc: file1=3 file2=5

How can I get this output? I thought about using a partitioning function, but I don't know how to do that. Any suggestions? Thanks in advance.

Upvotes: 1

Views: 307

Answers (1)

siddhartha jain

Reputation: 1006

There are various ways to achieve the output you want, but since you mentioned doing it with a partitioner, let's do it that way.

According to your question, you need a partitioner that works on the part of the key you want to split the output by, which is the "Term" (Iphone, Galaxy, etc.). I am assuming here that your map output key and value are both of type Text; if not, change the code accordingly. This is what your partitioner should look like:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class Partitioners extends Partitioner<Text, Text> {
    // Written for 3 reducers, since the example has 3 distinct terms.
    // Tip: the number of reducers should equal the number of groups you want to split the map output into.
    @Override
    public int getPartition(Text key, Text value, int numReduceTasks) {
        String sKey = key.toString();
        // Assumes the composite key looks like "Term:filename"; adjust this to your actual key format.
        // If there is no ':' in the key, the whole key is treated as the term.
        int idx = sKey.indexOf(':');
        String term = (idx >= 0) ? sKey.substring(0, idx) : sKey;
        if (term.equals("Iphone")) {
            // every key whose term is Iphone goes to the first reducer (partition 0)
            return 0;
        } else if (term.equals("Galaxy")) {
            // every key whose term is Galaxy goes to the second reducer (partition 1)
            return 1;
        } else {
            // every other key (Htc in your case) goes to the third reducer (partition 2)
            return 2;
        }
    }
}
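
The hard-coded branches above only cover the three terms from the question. A minimal sketch of a more general variant, assuming the key still has the form "Term:filename", is to hash the term part so that all keys sharing a term get the same partition number no matter how many distinct terms there are. The class name TermPartitionSketch and the methods termOf and partitionFor are hypothetical, and the code is plain Java without any Hadoop dependency so it can be run on its own; inside a real partitioner the same expression could serve as the body of getPartition.

```java
public class TermPartitionSketch {
    // Extracts the term from a composite key of the assumed form "Term:filename".
    public static String termOf(String compositeKey) {
        int idx = compositeKey.indexOf(':');
        // Fall back to the whole key when no ':' is present.
        return (idx >= 0 ? compositeKey.substring(0, idx) : compositeKey).trim();
    }

    // Maps a composite key to a partition in the range [0, numReduceTasks).
    public static int partitionFor(String compositeKey, int numReduceTasks) {
        // Mask the sign bit so the result is never negative, then take the remainder.
        return (termOf(compositeKey).hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Keys with the same term always land in the same partition.
        System.out.println(
            partitionFor("Iphone:file1", 3) == partitionFor("Iphone:file2", 3)); // prints true
    }
}
```

With this approach two different terms may share a reducer, which is fine as long as all keys for any one term stay together.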

Once the partitioner is written, the driver class needs to be told about it, so add the following to your driver class:

job.setPartitionerClass(Partitioners.class);
job.setNumReduceTasks(3); //since we want 3 reducers

This will divide your map output into 3 partitions, and you can then reduce each one accordingly in your reducer class.
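
The partitioner only decides which reducer receives a key; the per-file counts still have to be joined into one line per term. A rough sketch of that combining step, written as plain Java so it runs without Hadoop, is shown here. It assumes the input lines look like the reducer output in the question ("Term: filename count"); the class name TermMergeSketch and the method merge are hypothetical, and where exactly this logic would live in a job (for example, a reducer keyed on the term alone) is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TermMergeSketch {
    // Merges lines of the assumed form "Term: filename count" into one line per term,
    // e.g. "Iphone: file1 4" and "Iphone: file2 3" become "Iphone: file1=4 file2=3".
    public static List<String> merge(List<String> lines) {
        // LinkedHashMap keeps terms in the order they first appear.
        Map<String, StringBuilder> byTerm = new LinkedHashMap<>();
        for (String line : lines) {
            int idx = line.indexOf(':');
            String term = line.substring(0, idx).trim();
            // After the colon: the file name, then the occurrence count.
            String[] rest = line.substring(idx + 1).trim().split("\\s+");
            byTerm.computeIfAbsent(term, t -> new StringBuilder(t + ":"))
                  .append(' ').append(rest[0]).append('=').append(rest[1]);
        }
        List<String> out = new ArrayList<>();
        for (StringBuilder sb : byTerm.values()) {
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "Iphone: file1 4", "Iphone: file2 3",
            "Galaxy: file1 2", "Htc: file1 3", "Htc: file2 5");
        for (String line : merge(input)) {
            System.out.println(line);
        }
    }
}
```

Run on the five example lines from the question, this prints the three lines listed under "Output I want".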

I hope this solves your problem. If not, let me know.

Upvotes: 1
