Reputation: 131
I have chained two mappers followed by one reducer. Is it possible to write the intermediate outputs (the output of each mapper in the chain) to HDFS? I tried setting the OutputPath for each, but it doesn't seem to work. Now I am not sure whether it can be done at all. Any suggestions?
Upvotes: 1
Views: 1254
Reputation: 20969
When a job has a reducer, the map output is only intermediate data: it is spilled to temporary files on the task nodes (not to a permanent HDFS location), and those files are deleted after the job completes. If you need the map output, you have to chain two jobs: one job with no reducer, whose map output is written to HDFS, followed by a job with a reducer. Alternatively, if you are comfortable writing HDFS files from inside a map task, that is also possible.
The first approach needs no extra logic beyond job setup, while the second means writing the file-handling code yourself. It's up to you!
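For the two-job approach, a rough sketch could look like the following. It assumes the newer `org.apache.hadoop.mapreduce` API, `Text` keys and values, and three path arguments (input, intermediate, output). The class names `TwoStageDriver`, `FirstMapper`, `SecondMapper` and `FinalReducer` are hypothetical placeholders with toy logic, not taken from the question, so swap in your own mappers and reducer.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class TwoStageDriver {

    // Hypothetical first mapper: emits (first token of the line, whole line).
    public static class FirstMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String firstToken = line.toString().split("\\s+", 2)[0];
            context.write(new Text(firstToken), line);
        }
    }

    // Hypothetical second mapper: upper-cases the value, keeps the key.
    public static class SecondMapper extends Mapper<Text, Text, Text, Text> {
        @Override
        protected void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(key, new Text(value.toString().toUpperCase()));
        }
    }

    // Hypothetical reducer: counts the values seen for each key.
    public static class FinalReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            int count = 0;
            for (Text ignored : values) {
                count++;
            }
            context.write(key, new Text(Integer.toString(count)));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]); // output of the first mapper, kept in HDFS
        Path output = new Path(args[2]);

        // Job 1: map-only, so the mapper output goes straight to the output path.
        Job first = Job.getInstance(conf, "stage 1: map only");
        first.setJarByClass(TwoStageDriver.class);
        first.setMapperClass(FirstMapper.class);
        first.setNumReduceTasks(0);
        first.setOutputKeyClass(Text.class);
        first.setOutputValueClass(Text.class);
        first.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileInputFormat.addInputPath(first, input);
        FileOutputFormat.setOutputPath(first, intermediate);
        if (!first.waitForCompletion(true)) {
            System.exit(1);
        }

        // Job 2: reads job 1's output, runs the second mapper and the reducer.
        Job second = Job.getInstance(conf, "stage 2: map and reduce");
        second.setJarByClass(TwoStageDriver.class);
        second.setInputFormatClass(SequenceFileInputFormat.class);
        second.setMapperClass(SecondMapper.class);
        second.setReducerClass(FinalReducer.class);
        second.setOutputKeyClass(Text.class);
        second.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(second, intermediate);
        FileOutputFormat.setOutputPath(second, output);
        System.exit(second.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that this only keeps the output of the first mapper. The second mapper's output is still intermediate data inside job 2, so keeping that as well would need yet another map-only stage or a side file written from the map task.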
Upvotes: 3