fTTTTT
fTTTTT

Reputation: 163

Change the default delimiter of the mapreduce

Hi I am a beginner to MapReduce, and I want to program the WordCount so it output the K/V pairs. But the question is I don't want to use the 'tab' as the key value pair delimiter for the file. How could I change it?

The code I use is slightly different from the example one. Here is the driver class.

    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "Job1");
    job.setJarByClass(Simpletask.class);
    job.setMapperClass(TokenizerMapper.class);
    //job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);

Since I want the file name to be respective with the partition of the reducer, I use multipleout.write() in the reduce function, and thus the code is slightly different.

public void reduce(IntWritable key,Iterable<Text> values, Context context) throws IOException, InterruptedException {
    String accu = "";
    for (Text val : values) {
        String[] entry=val.toString().split(",");
        String MBR = entry[1];
        //ASSUME MBR IS ENTRY 1. IT CAN BE REPLACED BY INVOKING FUNCTION TO CALCULATE MBR([COORDINATES])
        String mes_line = entry[0]+",MBR"+MBR+" ";
        result.set(mes_line);
        mos.write(key, result, generateFileName(key));
    }

Any help will be appreciated! Thank you!

Upvotes: 0

Views: 575

Answers (1)

Ramzy
Ramzy

Reputation: 7148

Since you are using FileInputFormat the key is the line offset in the file, and the value is a line from the input file. It's upto the mapper to split the input line with any delimiter. You can use it to split the record read in map method. The default behavior comes with a specific input format like TextInputFormat etc.

Upvotes: 0

Related Questions