How to sort values (with their corresponding keys) in the Hadoop MapReduce framework?

Question

I am trying to sort my input data with Hadoop MapReduce. The problem is that I can only sort the key-value pairs by key, whereas I want to sort them by value. Each value's key comes from a counter, so the first value (234) has key 1, the second value (944) has key 2, and so on. How can I order the input by value instead?


import java.io.IOException;
import java.util.StringTokenizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Collections;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Sortt {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {
    int k = 0;
    int v = 0;
    int va = 0;
    public Text ke = new Text();
    private final static IntWritable val = new IntWritable();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());

      while (itr.hasMoreTokens()) {
        val.set(Integer.parseInt(itr.nextToken()));
        v = val.get();
        k = k + 1;                    // running counter, used as the key
        ke.set(Integer.toString(k));

        context.write(ke, new IntWritable(v));
      }
    }
  }

  public static class SortReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {
    int a = 0;
    int v = 0;
    private IntWritable va = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      List<Integer> sorted = new ArrayList<Integer>();

      for (IntWritable val : values) {
        a = val.get();
        sorted.add(a);
      }
      Collections.sort(sorted);
    for(int i=0;i



Input:

234

944

241

130

369

470

250

100

250

735

856

659

425

756

123

756

459

754

654

951

753

254

698

741

Expected Output: 

8   100

15  123

4   130

1   234

3   241

7   250

9   250

22  254

5   369

13  425

17  459

6   470

19  654

12  659

23  698

10  735

24  741

21  753

18  754

14  756

16  756

11  856

2   944

20  951

Current Output:

1   234

10  735

11  856

12  659

13  425

14  756

15  123

16  756

17  459

18  754

19  654

2   944

20  951

21  753

22  254

23  698

24  741

3   241

4   130

5   369

6   470

7   250

8   100

9   250
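
The order 1, 10, 11, ..., 2, 20, ... in the current output is what you get when the keys are Text: Hadoop sorts map output by key, and Text keys are compared as strings rather than as numbers. The following plain-Java sketch (no Hadoop dependency; the class and method names are made up for illustration) mimics the two orderings using String and int comparisons as stand-ins for Text and IntWritable:

```java
import java.util.Arrays;

class KeyOrderDemo {
    // Sorts keys as strings, standing in for how Text keys are ordered (lexicographically).
    static String[] sortAsText(String[] keys) {
        String[] copy = keys.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sorts the same keys as integers, standing in for IntWritable's numeric ordering.
    static int[] sortAsInt(String[] keys) {
        int[] nums = Arrays.stream(keys).mapToInt(Integer::parseInt).toArray();
        Arrays.sort(nums);
        return nums;
    }

    public static void main(String[] args) {
        String[] keys = {"1", "2", "3", "10", "11", "20"};
        System.out.println(Arrays.toString(sortAsText(keys))); // [1, 10, 11, 2, 20, 3]
        System.out.println(Arrays.toString(sortAsInt(keys)));  // [1, 2, 3, 10, 11, 20]
    }
}
```

Note also that the reducer in the question receives one value per key, so sorting the list inside reduce() has nothing to reorder.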

Answers (1)
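
One common approach, offered here as a sketch rather than a definitive answer: MapReduce only sorts by key during the shuffle, so make the number you want to sort on the key. The mapper would emit (value, counter) with an IntWritable key so that the comparison is numeric, and the reducer would write the pair back as (counter, value). A single reducer is assumed so that the output is globally ordered. The plain-Java simulation below has no Hadoop dependency; the class and method names are hypothetical, and a TreeMap stands in for the framework's sort-and-group step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortByValueSketch {
    // Returns {position, value} pairs ordered by value.
    static List<int[]> sortByValue(int[] input) {
        // "Map" phase: emit (value, position), so the number to sort on is the key.
        // "Shuffle": the TreeMap keeps keys in ascending numeric order and groups
        // all positions that share the same value (an assumption standing in for Hadoop).
        TreeMap<Integer, List<Integer>> shuffled = new TreeMap<>();
        for (int i = 0; i < input.length; i++) {
            int position = i + 1; // same 1-based counter as in the question
            shuffled.computeIfAbsent(input[i], k -> new ArrayList<>()).add(position);
        }

        // "Reduce" phase: walk the keys in order and swap each pair back
        // to (position, value) for output.
        List<int[]> output = new ArrayList<>();
        for (Map.Entry<Integer, List<Integer>> e : shuffled.entrySet()) {
            for (int position : e.getValue()) {
                output.add(new int[] {position, e.getKey()});
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // First nine values from the question's input.
        int[] input = {234, 944, 241, 130, 369, 470, 250, 100, 250};
        for (int[] pair : sortByValue(input)) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

For the nine sample values, the first line printed is "8	100" and the last is "2	944". In a real job the order of values inside one key group (for example the two 250s) is not guaranteed, and a counter kept in the mapper is only unique while a single map task reads the whole input.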
