Reputation: 6485
I'm new to Apache Oozie, and as far as I understand, workflow/coordinator jobs must be predefined in XML files. However, I need to define the workflow dynamically: depending on the input files, the workflow might need extra actions. Is there any way to do that programmatically?
Upvotes: 2
Views: 1891
Reputation: 1529
I totally agree with Mzf's response, but I want to add something that also answers jamiet's question and makes for a more general answer. If it makes sense to split your workflow into multiple flows/cases, then a Decision Control Node is the way to go.
Sometimes, however, as jamiet asked in the comments, you need to call a workflow/action once per item in a collection. The length of the workflow then varies from run to run, from 1 action to 100, depending on the collection. This is not something you can represent with simple decision control nodes. One of my use cases is generating a workflow that does a Sqoop import for each of the table/database pairs present in a config file.
My solution to this problem is to have one workflow call a custom script, possibly with some parameters. The script builds the workflow.xml file of the 'dynamic workflow', containing the actions that correspond to your collection. Once workflow.xml has been built, the script calls oozie job with a job.properties file pointing to the newly created workflow.xml.
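To make the generation step concrete, here is a minimal sketch of such a script in Python. It is not the script I actually use; the function name `build_workflow`, the action naming scheme, the `${jdbcUrl}` property, and the `/data/...` target directory are all assumptions chosen for illustration. It chains one Sqoop action per (database, table) pair, each pointing at the next one and the last pointing at `end`.

```python
import xml.etree.ElementTree as ET


def build_workflow(pairs, name="dynamic-sqoop-import"):
    """Return workflow.xml text with one Sqoop action per (database, table) pair.

    All names below (action names, jdbcUrl, target dirs) are hypothetical.
    """
    if not pairs:
        raise ValueError("need at least one (database, table) pair")

    root = ET.Element(
        "workflow-app", {"xmlns": "uri:oozie:workflow:0.5", "name": name}
    )
    action_names = [f"import-{db}-{table}" for db, table in pairs]

    # The start node points at the first generated action.
    ET.SubElement(root, "start", {"to": action_names[0]})

    for i, (db, table) in enumerate(pairs):
        action = ET.SubElement(root, "action", {"name": action_names[i]})
        sqoop = ET.SubElement(
            action, "sqoop", {"xmlns": "uri:oozie:sqoop-action:0.2"}
        )
        # ${...} values are left for Oozie to resolve from job.properties.
        ET.SubElement(sqoop, "job-tracker").text = "${jobTracker}"
        ET.SubElement(sqoop, "name-node").text = "${nameNode}"
        ET.SubElement(sqoop, "command").text = (
            f"import --connect ${{jdbcUrl}}/{db} --table {table} "
            f"--target-dir /data/{db}/{table}"
        )
        # Chain to the next action, or to the end node after the last one.
        is_last = i + 1 == len(pairs)
        next_node = "end" if is_last else action_names[i + 1]
        ET.SubElement(action, "ok", {"to": next_node})
        ET.SubElement(action, "error", {"to": "fail"})

    kill = ET.SubElement(root, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = (
        "Import failed at [${wf:lastErrorNode()}]"
    )
    ET.SubElement(root, "end", {"name": "end"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # In practice the pairs would be read from the config file.
    xml_text = build_workflow([("sales", "orders"), ("sales", "customers")])
    with open("workflow.xml", "w", encoding="utf-8") as fh:
        fh.write(xml_text)
```

After writing the file, the script would still have to copy it to the HDFS application path named in job.properties and submit it, for example with something like `oozie job -config job.properties -run`; those steps depend on your cluster setup and are left out here.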
Upvotes: 2
Reputation: 5260
Workflows and coordinators are predefined files, but that doesn't mean you can't control which workflow actions run.
If you have several cases/flows in your workflow, you can add a Decision Control Node to control the flow.
For example, say that for input A the workflow should run ActionA_1 and ActionA_2, and for input B it should run ActionB_1 and ActionB_2. A Decision Control Node chooses which execution path to follow.
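To show the shape of such a node, here is a small Python sketch that renders a decision node as XML. The helper `build_decision`, the node name `route-by-input`, the fallback node `fail`, and the `inputType` job property are assumptions for illustration only; the predicates are ordinary Oozie EL expressions, evaluated in order, with the first true case winning.

```python
import xml.etree.ElementTree as ET


def build_decision(name, cases, default_to):
    """Build a <decision> element from (predicate, target node) pairs."""
    decision = ET.Element("decision", {"name": name})
    switch = ET.SubElement(decision, "switch")
    for predicate, target in cases:
        # Each case holds an EL predicate and the node to jump to if it is true.
        ET.SubElement(switch, "case", {"to": target}).text = predicate
    # The default branch is taken when no case predicate matches.
    ET.SubElement(switch, "default", {"to": default_to})
    return decision


# inputType is a hypothetical property passed in via job.properties.
node = build_decision(
    "route-by-input",
    [
        ("${wf:conf('inputType') eq 'A'}", "ActionA_1"),
        ("${wf:conf('inputType') eq 'B'}", "ActionB_1"),
    ],
    default_to="fail",
)
print(ET.tostring(node, encoding="unicode"))
```

In the workflow itself, ActionA_1 would then transition to ActionA_2 through its `ok` element, and likewise ActionB_1 to ActionB_2, so the decision node only has to pick the first action of each path.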
Upvotes: 1