Meg

Reputation: 131

hadoop chain map/reduce

I have chained two mappers followed by one reducer. Is it possible to write the intermediate outputs (the output of each mapper in the chain) to HDFS? I tried setting the OutputPath for each, but it doesn't seem to work. Now I am not sure whether it can be done at all. Any suggestions?

Upvotes: 1

Views: 1254

Answers (1)

Thomas Jungblut

Reputation: 20969

The result is always written to HDFS as a SequenceFile. But if you are using a reducer, these are just temporary files, and they get deleted after the job completes. If you need the map output, you have to chain two jobs: one job with no reducer, followed by a job with a reducer. Alternatively, if you are comfortable writing HDFS files from within a map task, that is also possible.
The first approach requires no custom code, but the second does. It's up to you!
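
For illustration, here is a minimal sketch of the two-job approach. It assumes the Hadoop 2+ `org.apache.hadoop.mapreduce` API and plain-text input. The class name `TwoJobChain`, the job names, and the three path arguments are hypothetical, and the identity `Mapper` and `Reducer` base classes stand in for your own classes.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical driver: args are <input> <intermediate> <output> paths on HDFS.
public class TwoJobChain {
    public static void main(String[] args) throws Exception {
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]);
        Path output = new Path(args[2]);
        Configuration conf = new Configuration();

        // Job 1: map-only. With zero reducers, the map output is the job
        // output, so it stays at the intermediate path after completion.
        Job mapOnly = Job.getInstance(conf, "map-only stage");
        mapOnly.setJarByClass(TwoJobChain.class);
        mapOnly.setMapperClass(Mapper.class); // identity mapper; use your own
        mapOnly.setNumReduceTasks(0);
        mapOnly.setOutputKeyClass(LongWritable.class);
        mapOnly.setOutputValueClass(Text.class);
        mapOnly.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileInputFormat.addInputPath(mapOnly, input);
        FileOutputFormat.setOutputPath(mapOnly, intermediate);
        if (!mapOnly.waitForCompletion(true)) {
            System.exit(1);
        }

        // Job 2: reads the persisted map output and runs the reducer.
        Job reduceStage = Job.getInstance(conf, "reduce stage");
        reduceStage.setJarByClass(TwoJobChain.class);
        reduceStage.setInputFormatClass(SequenceFileInputFormat.class);
        reduceStage.setMapperClass(Mapper.class);   // identity mapper; use your own
        reduceStage.setReducerClass(Reducer.class); // identity reducer; use your own
        reduceStage.setOutputKeyClass(LongWritable.class);
        reduceStage.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(reduceStage, intermediate);
        FileOutputFormat.setOutputPath(reduceStage, output);
        System.exit(reduceStage.waitForCompletion(true) ? 0 : 1);
    }
}
```

The intermediate path must not exist before the first job runs. To persist the output of both mappers in the chain, the same pattern would presumably extend to a second map-only job between the two shown here.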

Upvotes: 3
