Gyanendra Dwivedi

Reputation: 5557

Map-Reduce Program : Mapper not behaving as expected

Friends,

I am new to Map-Reduce and am trying out an example that runs only a Mapper, but the output is not what I expect. Please help me find what I am missing:

Code part:

Imports:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

Driver Program

Configuration conf = new Configuration();
Job job = new Job(conf, "SampleProgram");
job.setJarByClass(SampleMR.class);     // class that contains mapper and reducer
job.setMapperClass(MyMapper.class);
job.setReducerClass(MyReducer.class);    // reducer class

job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
job.setNumReduceTasks(0);
FileInputFormat.setInputPaths(job, new Path("/tmp/"));
FileOutputFormat.setOutputPath(job, new Path("/tmp/out"));  // adjust directories as required

// waitForCompletion submits the job itself if it has not been submitted yet
boolean b = job.waitForCompletion(true);
if (!b) {
    throw new IOException("error with job!");
}

Mapper Program

public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    public void map(LongWritable idx, Text value, Context context) throws IOException, InterruptedException {
        String[] tokens = value.toString().split("|");
        String keyPrefix = tokens[0] + tokens[1];
        context.write(new Text(keyPrefix), value);
    }
}

There is a reducer phase as well, but I have set the number of reducers to 0 to debug the issue. The mapper itself is not behaving correctly.

For the Input

379782759851005|ABCDEFG|name:YOLO|top:44.7|avgtop:19.2

The expected Map output is

379782759851005ABCDEFG [Blank Space] 379782759851005|ABCDEFG|name:YOLO|top:44.7|avgtop:19.2

Output of my Mapper

3 [Blank Space] 379782759851005|ABCDEFG|name:YOLO|top:44.7|avgtop:19.2

It looks like the key contains only the first character of the expected output. The same happens with the value if I write tokens[4] as the value to the context. Something seems to go wrong when splitting the string. Any insight into what could be going wrong?

Upvotes: 0

Views: 82

Answers (1)

mangusta

Reputation: 3554

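You need to escape the pipe character. String.split takes a regular expression, and an unescaped | is the regex alternation operator, so it matches the empty string between every character and each token ends up being a single character. Use split("\\|") instead. See the link below: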

Splitting string with pipe character ("|")
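
To make the difference concrete, here is a minimal standalone sketch, outside Hadoop, that builds the key the same way the mapper does. The class name PipeSplitDemo and the helper keyPrefix are made up for illustration; only String.split and Pattern.quote come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original job.
public class PipeSplitDemo {

    // Builds the key the way the mapper does: first token + second token.
    static String keyPrefix(String line, String delimiterRegex) {
        String[] tokens = line.split(delimiterRegex);
        return tokens[0] + tokens[1];
    }

    public static void main(String[] args) {
        String line = "379782759851005|ABCDEFG|name:YOLO|top:44.7|avgtop:19.2";

        // Unescaped: "|" is an alternation of two empty patterns, so it
        // matches between every character and the tokens are single characters.
        System.out.println(keyPrefix(line, "|"));

        // Escaped: matches a literal pipe character.
        System.out.println(keyPrefix(line, "\\|"));

        // Same effect, letting the library quote the delimiter.
        System.out.println(keyPrefix(line, Pattern.quote("|")));
    }
}
```

The escaped and quoted variants both print 379782759851005ABCDEFG. The exact result of the unescaped call depends on the Java version: on Java 7 and earlier the split yields a leading empty token, which would give the "3" seen in the question, while Java 8 and later drop that empty token. In the mapper, the fix would be changing the split call to value.toString().split("\\|").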

Upvotes: 1
