Reputation: 8449
I'm looking for a simple way to chain post-processing code after a MapReduce job.
Specifically, it involves renaming/moving the output files created by org.apache.hadoop.mapred.lib.MultipleOutputs
(the class has limitations on the output file names, so I can't produce the files directly in the MapReduce job).
The options I know of (or can think of) are:
Upvotes: 0
Views: 537
Reputation: 1399
Your "simple" task should be a mapper-only job. Your map() receives the file name as its key and renames the file. For this you have to write your own InputFormat and RecordReader, like in the links, but your RecordReader should not actually read the file; it should just return the file name in getCurrentKey().
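
For the renaming step itself, here is a rough sketch of the name mapping that map() could apply to each file name it receives. It is plain Java with no Hadoop dependency, so it only shows the string logic. The `name-m-00000` / `name-r-00000` input pattern is my assumption about how MultipleOutputs names its files, the `report.<name>.<partition>.txt` target scheme is purely illustrative, and `OutputRenamer` and `targetName` are made-up names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OutputRenamer {
    // Assumed source pattern: <namedOutput>-<m|r>-<partition digits>,
    // e.g. "sales-r-00000". Adjust if your files are named differently.
    private static final Pattern PART_FILE =
            Pattern.compile("([A-Za-z0-9_]+)-([mr])-(\\d+)");

    // Returns the new file name, or empty if the file does not look like
    // a named-output part file (e.g. "_SUCCESS" or "part-00000").
    // The target scheme below is hypothetical.
    static Optional<String> targetName(String fileName) {
        Matcher m = PART_FILE.matcher(fileName);
        if (!m.matches()) {
            return Optional.empty();
        }
        String namedOutput = m.group(1);
        String partition = m.group(3);
        return Optional.of("report." + namedOutput + "." + partition + ".txt");
    }

    public static void main(String[] args) {
        for (String name : new String[] {"sales-r-00000", "_SUCCESS"}) {
            System.out.println(name + " -> " + targetName(name).orElse("(skipped)"));
        }
    }
}
```

Inside the real mapper you would pass the source path and the computed target to Hadoop's `FileSystem.rename(Path, Path)` and check its boolean result, skipping any file for which no target name comes back.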
Upvotes: 1