Reputation: 1
I am trying to load text/csv files in hive scripts with oozie and schedule it on daily basis. Text files are in local unix file system.
I need to put those text files into hdfs before executing the hive scripts in a oozie workflow.
In a real time cluster we don't know job will run on which node.it will run randomly in any one of the node in cluster.
can any one provide me the solution.
Thanks in advance.
Upvotes: 0
Views: 985
Reputation: 9067
Not sure I understand what you want to do.
The way I see it, it can't work:
So why don't you load your files into HDFS beforehand? The transfer may be triggered either when the files are available (post-processing action in the upstream job) or at fixed time (using Linux CRON).
You don't even need the Hadoop libraries on the Linux box if the WebHDFS service is active on your NameNode - just use CURL and a HTTP upload.
Upvotes: 1