Reputation: 41
With an Oozie workflow, you have to specify the cluster's JobTracker in the properties for the workflow. This is easy when you have a single JobTracker:
jobTracker=hostname:port
When the cluster is configured for JobTracker HA (high availability), I need my properties files to reach whichever JobTracker host is currently active, without having to update every properties file when the JobTracker fails over to the second node.
When you access one JobTracker over HTTP, it redirects to the other if it isn't the one running. Oozie doesn't use HTTP, though, so there is no redirect, and the workflow fails if the properties file names the JobTracker host that is not running.
How can I configure my properties file to handle a JobTracker running in HA?
Upvotes: 4
Views: 2629
Reputation: 551
Specify the nameservice of the cluster on which HA is enabled. For example, in the properties file:
namenode=hdfs://<nameservice>
jobTracker=<nameservice>:8032
Upvotes: 0
Reputation: 106
I just finished setting up some Oozie workflows to use HA JobTrackers and NameNodes. The key is to use the logical name of the HA service you configured, not any individual hostname or port. For example, the default HA JobTracker name is 'logicaljt'. Replace hostname:port with 'logicaljt', and everything should just work, as long as the node from which you're running Oozie has the appropriate hdfs-site and mapred-site configs installed (implicitly, because it is part of the cluster, or explicitly, because you added a gateway role to it).
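A minimal sketch of what such a properties file might look like, assuming the default 'logicaljt' JobTracker name mentioned above. The nameservice 'nameservice1' and the application path are hypothetical placeholders; substitute the logical names defined in your own hdfs-site and mapred-site configs.

```properties
# Logical HDFS nameservice instead of a NameNode host:port.
# 'nameservice1' is a made-up example name.
nameNode=hdfs://nameservice1

# Logical HA JobTracker name instead of hostname:port.
jobTracker=logicaljt

# Hypothetical location of the workflow application in HDFS.
oozie.wf.application.path=${nameNode}/user/${user.name}/my-workflow
```

Because neither value names a physical host, the same file should keep working after a failover, provided the client configs on the Oozie node can resolve the logical names to the active hosts.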
Upvotes: 2