wj1091

Reputation: 159

How to output files with a specific extension (like .csv) in Hadoop, using MultipleOutputs class

I currently have a MapReduce program that uses MultipleOutputs to write its results to multiple files. The reducer looks like this:

private MultipleOutputs<NullWritable, Text> mo = new MultipleOutputs<>(context);
...
public void reduce(Edge keys, Iterable<NullWritable> values, Context context)
            throws IOException, InterruptedException {
        String date = records.formatDate(millis);
        out.set(keys.get(0) + "\t" + keys.get(1));
        parser.parse(key); 
        String filePath = String.format("%s/part", parser.getFileID());
        mo.write(noval, out, filePath);
    }

This is very similar to the example in the book Hadoop: The Definitive Guide. The problem is that the files come out as plain text. I want them written as .csv files, and I haven't found an explanation of how to do this in the book or online.

How can this be done?

Upvotes: 1

Views: 610

Answers (1)

Turbero

Reputation: 126

Have you tried iterating over the output folder in your driver after the Job completes, and renaming the files there?

As long as your reducer emits the CSV line as the Text value (with the fields separated by commas, semicolons, or whatever you need), you can try something like this:

Job job = Job.getInstance(getConf());
//...
//your job setup, including the output config
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);
//...
boolean success = job.waitForCompletion(true);
if (success) {
    FileSystem hdfs = FileSystem.get(getConf());
    // Append ".csv" to every file the job wrote into the output directory
    FileStatus[] fs = hdfs.listStatus(new Path(outputPath));
    if (fs != null) {
        for (FileStatus aFile : fs) {
            if (!aFile.isDirectory()) {
                hdfs.rename(aFile.getPath(), new Path(aFile.getPath().toString() + ".csv"));
            }
        }
    }
}
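The same rename pass can be sketched against a local directory with plain java.io, which is handy for testing the logic without a running HDFS cluster. This is an illustration only; the class name and the default "output" directory are hypothetical, and in the real job you would use FileSystem.rename as above:

```java
import java.io.File;
import java.io.IOException;

public class RenameToCsv {

    // Appends ".csv" to every plain file in dir, mirroring the
    // post-job HDFS rename loop on a local filesystem.
    static void addCsvExtension(File dir) throws IOException {
        File[] files = dir.listFiles();
        if (files == null) {
            return; // not a directory, or it does not exist
        }
        for (File f : files) {
            if (f.isFile() && !f.getName().endsWith(".csv")) {
                File target = new File(f.getParentFile(), f.getName() + ".csv");
                if (!f.renameTo(target)) {
                    throw new IOException("Could not rename " + f);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical output directory; pass your own as the first argument
        File dir = new File(args.length > 0 ? args[0] : "output");
        addCsvExtension(dir);
    }
}
```

A file named part-r-00000 in the directory would end up as part-r-00000.csv; files that already end in .csv are left alone.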

Upvotes: 2
