Reputation: 608
I have a tool that chains many Mappers and Reducers, and at some point I need to merge the results of previous map-reduce steps. For example, as input I have two files with data:
/input/a.txt
apple,10
orange,20
/input/b.txt
apple;5
orange;40
The result should be c.txt, where c.value = a.value * b.value:
/output/c.txt
apple,50 // 10 * 5
orange,800 // 20 * 40
How could this be done? I've resolved it by introducing a simple Key => MyMapWritable (type=1,2, value) and merging (actually, multiplying) the data in the reducers. It works, but:
Upvotes: 3
Views: 6976
Reputation: 5791
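// Inside Mapper.map(): find out which input file the current record came from.
// FileSplit is org.apache.hadoop.mapreduce.lib.input.FileSplit; the cast only
// works for file-based input formats such as TextInputFormat.
String fileName = ((FileSplit) context.getInputSplit()).getPath().toString();
if (fileName.contains("file_1")) {
    // TODO: handle a record from file 1
} else {
    // TODO: handle a record from file 2
}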
Upvotes: 1
Reputation: 30089
Assuming both inputs have been partitioned and sorted in the same way, you can use the CompositeInputFormat to perform a map-side join. There's an article on using it here. I don't think it's been ported to the new mapreduce API though.
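To illustrate why the identical partitioning and sorting matters, here is a minimal Hadoop-free sketch of the sorted-merge idea behind a map-side join. It is not the CompositeInputFormat API; the class name MergeJoinSketch and its methods are made up for illustration, and it assumes each key appears at most once per input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MergeJoinSketch {

    // Splits "key<delimiter>value" into {key, value}.
    static String[] parse(String line, String delimiter) {
        return line.split(delimiter, 2);
    }

    // Walks two inputs that are already sorted by key in lockstep and
    // multiplies the values of matching keys. Keys present in only one
    // input are skipped (inner join).
    static List<String> join(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            String[] left = parse(a.get(i), ",");   // a.txt uses ','
            String[] right = parse(b.get(j), ";");  // b.txt uses ';'
            int cmp = left[0].compareTo(right[0]);
            if (cmp == 0) {
                long product = Long.parseLong(left[1].trim())
                        * Long.parseLong(right[1].trim());
                out.add(left[0] + "," + product);
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // key only in a, advance a
            } else {
                j++; // key only in b, advance b
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("apple,10", "orange,20");
        List<String> b = Arrays.asList("apple;5", "orange;40");
        for (String line : join(a, b)) {
            System.out.println(line);
        }
    }
}
```

Because both sides are consumed in key order, no shuffle or reducer is needed; if the two inputs were sorted or partitioned differently, the lockstep walk would miss matches.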
Secondly, you can get the input file in the mapper by calling context.getInputSplit(). This returns the InputSplit, which, if you're using TextInputFormat, you can cast to a FileSplit and then call getPath() to get the file name. I don't think you can use this method with CompositeInputFormat though, as you won't know where the Writables in the TupleWritable have come from.
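As a rough illustration of how the file name can drive a reduce-side join, the following plain-Java sketch simulates the idea without any Hadoop classes: each record is tagged by the path it came from, values are grouped by key, and a key is emitted only when both sides are present. The class TaggedJoinSketch, its join method, and the rule that a path ending in a.txt is the first input are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TaggedJoinSketch {

    // inputs maps a source path to its lines. In a real job the path would
    // come from the mapper's input split rather than from a map key.
    static Map<String, Long> join(Map<String, List<String>> inputs) {
        // "Map" phase: tag each value with its source (slot 0 = a, slot 1 = b).
        Map<String, Long[]> grouped = new TreeMap<>();
        for (Map.Entry<String, List<String>> input : inputs.entrySet()) {
            boolean fromA = input.getKey().endsWith("a.txt");
            String delimiter = fromA ? "," : ";";
            for (String line : input.getValue()) {
                String[] kv = line.split(delimiter, 2);
                Long[] slots = grouped.computeIfAbsent(kv[0], k -> new Long[2]);
                slots[fromA ? 0 : 1] = Long.parseLong(kv[1].trim());
            }
        }
        // "Reduce" phase: multiply only when both sources supplied a value.
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long[]> e : grouped.entrySet()) {
            Long[] slots = e.getValue();
            if (slots[0] != null && slots[1] != null) {
                result.put(e.getKey(), slots[0] * slots[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> inputs = new LinkedHashMap<>();
        inputs.put("/input/a.txt", Arrays.asList("apple,10", "orange,20"));
        inputs.put("/input/b.txt", Arrays.asList("apple;5", "orange;40"));
        for (Map.Entry<String, Long> e : join(inputs).entrySet()) {
            System.out.println(e.getKey() + "," + e.getValue());
        }
    }
}
```

The source tag is what lets the mapper choose the right delimiter and lets the reducer tell a complete pair from a key that appeared in only one file.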
Upvotes: 3