HHH

Reputation: 6475

How to run Oozie jobs on an HDInsight cluster?

I have an Oozie workflow that I'd like to run on an HDInsight cluster. My job has a jar file and a workflow.xml file, both of which I store in Azure Blob storage. However, the only place I have found to store the job.config file is the local storage of the HDInsight headnode. My concern is what happens when the VM gets re-imaged: does that remove my job.config file?

Upvotes: 0

Views: 807

Answers (1)

Jennifer Marsman - MSFT

Reputation: 5225

In general, you can use Script Actions on HDInsight. Script actions customize an HDInsight cluster during provisioning, so the scripts run every time the cluster is created. (You were smart to be concerned about what happens when the cluster is re-created!)

The advanced configuration options show how to customize an HDInsight cluster during provisioning using PowerShell. They include an Oozie section:

# oozie-site.xml configuration
$OozieConfigValues = new-object 'Microsoft.WindowsAzure.Management.HDInsight.Cmdlet.DataObjects.AzureHDInsightOozieConfiguration'
$OozieConfigValues.Configuration = @{ "oozie.service.coord.normal.default.timeout"="150" }  # default 120
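
For context, here is a rough sketch of how that object might be attached to the cluster configuration at provisioning time. It continues from the `$OozieConfigValues` variable defined in the snippet above and assumes the classic Azure (service management) PowerShell module that this snippet comes from. The variables `$clusterNodes`, `$storageAccountName`, `$storageAccountKey`, and `$containerName` are hypothetical placeholders, not values from the original post.

```powershell
# Sketch only: assumes $OozieConfigValues was created as shown above.
# Hypothetical placeholders: $clusterNodes, $storageAccountName,
# $storageAccountKey, $containerName. Set these to your own values first.

# Build the cluster configuration, point it at the default storage account,
# and attach the custom oozie-site.xml values to it.
$config = New-AzureHDInsightClusterConfig -ClusterSizeInNodes $clusterNodes |
    Set-AzureHDInsightDefaultStorage -StorageAccountName "$storageAccountName.blob.core.windows.net" -StorageAccountKey $storageAccountKey -StorageContainerName $containerName |
    Add-AzureHDInsightConfigValues -Oozie $OozieConfigValues
```

The resulting `$config` object would then be passed to the cluster creation cmdlet, so the Oozie settings are applied each time the cluster is provisioned.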

Does that help?

Other resources:
Customizing HDInsight Cluster provisioning
Oozie tutorial on HDInsight

Upvotes: 1
