Reputation: 10931
Is it possible to use Oozie to concatenate the output of a MapReduce job into a single file? Let's say I have the output ...
part-r-00000
part-r-00001
part-r-00002
and I just want...
output.csv
I know I can pull them down as a single file with hadoop fs -getmerge, but I'm curious if it's possible with a workflow application and HDFS.
Upvotes: 3
Views: 1260
Reputation: 1518
You can probably use Pig or Java to call
or maybe add it to your own fork of Oozie's fs-action.
Alternatively, use the WebHDFS concat operation: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Concat_Files
You could wrap that curl call in a shell or ssh action.
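As a rough sketch of what that call looks like, the Python below builds the WebHDFS CONCAT URL and sends it as a POST, which is the equivalent of the curl command. The host name namenode, port 9870, and the /out/... paths are assumptions for illustration only; build_concat_url and concat are hypothetical helper names, not part of Hadoop or Oozie.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def build_concat_url(namenode, target, sources, user=None):
    """Build a WebHDFS CONCAT URL that appends `sources` onto `target`.

    namenode: base HTTP address of the NameNode (assumed, e.g. http://namenode:9870)
    target:   absolute HDFS path of the file that receives the data
    sources:  absolute HDFS paths to append, in order
    user:     optional user.name for simple (non-Kerberos) authentication
    """
    params = [("op", "CONCAT"), ("sources", ",".join(sources))]
    if user:
        params.append(("user.name", user))
    # Keep "/" and "," unescaped so the URL stays readable.
    query = urlencode(params, safe="/,")
    return f"{namenode.rstrip('/')}/webhdfs/v1{quote(target)}?{query}"


def concat(namenode, target, sources, user=None):
    """Send the CONCAT request; same effect as `curl -X POST <url>`."""
    request = Request(build_concat_url(namenode, target, sources, user), method="POST")
    with urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical host and paths; this only prints the URL, it sends nothing.
    print(build_concat_url(
        "http://namenode:9870",
        "/out/part-r-00000",
        ["/out/part-r-00001", "/out/part-r-00002"],
    ))
```

Note that concat appends the source files onto the existing target file rather than creating a new one, so getting output.csv would still need a separate rename step afterwards. HDFS also places restrictions on which files can be concatenated, so check the linked documentation for your Hadoop version.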
Upvotes: 0
Reputation: 30089
Two simple options I can think of:
Upvotes: 2