Reputation: 247
I have two Glue jobs, both created from the AWS console. I would like to invoke one Glue job (Python) from another Glue job (Python) and pass parameters to it. What would be the best approach? I appreciate your help.
Upvotes: 1
Views: 4960
Reputation: 3173
You can use Glue workflows and set up workflow run properties, as mentioned by Bob Haffner, then let the workflow trigger the Glue jobs. The advantage is that if the second Glue job fails, you can fix the issue and resume or rerun only that job. Workflow run properties can also be passed from one Glue job to the next. Sample code for reading and writing them:
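In the first Glue job:
import sys
import boto3
from awsglue.utils import getResolvedOptions
glue_client = boto3.client('glue')
# WORKFLOW_NAME and WORKFLOW_RUN_ID are supplied when the job is started by a workflow
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
workflow_name = args['WORKFLOW_NAME']
workflow_run_id = args['WORKFLOW_RUN_ID']
workflow_params = glue_client.get_workflow_run_properties(Name=workflow_name, RunId=workflow_run_id)["RunProperties"]
# param_value1..param_value4 are placeholders for values computed earlier in the job; they must be strings
workflow_params['param1'] = param_value1
workflow_params['param2'] = param_value2
workflow_params['param3'] = param_value3
workflow_params['param4'] = param_value4
glue_client.put_workflow_run_properties(Name=workflow_name, RunId=workflow_run_id, RunProperties=workflow_params)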
And in the second Glue job:
import sys
import boto3
from awsglue.utils import getResolvedOptions
glue_client = boto3.client('glue')
args = getResolvedOptions(sys.argv, ['WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
workflow_name = args['WORKFLOW_NAME']
workflow_run_id = args['WORKFLOW_RUN_ID']
# Read back the run properties written by the first job
workflow_params = glue_client.get_workflow_run_properties(Name=workflow_name, RunId=workflow_run_id)["RunProperties"]
param_value1 = workflow_params['param1']
param_value2 = workflow_params['param2']
param_value3 = workflow_params['param3']
param_value4 = workflow_params['param4']
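If you end up repeating this read/update/write pattern in several jobs, it may help to wrap it in two small helpers. The following is only a sketch: `publish_run_properties`, `read_run_properties`, and `FakeGlueClient` are hypothetical names, not part of boto3 or Glue, and the workflow name, run ID, and parameter values are made up. The fake client exists only so the logic can be tried locally; inside a real job you would pass `boto3.client('glue')` instead.

```python
from typing import Any, Dict, List


def publish_run_properties(glue_client: Any, workflow_name: str, run_id: str,
                           new_props: Dict[str, Any]) -> Dict[str, str]:
    """Merge new_props into the run's current properties and write them back."""
    current = glue_client.get_workflow_run_properties(
        Name=workflow_name, RunId=run_id
    )["RunProperties"]
    # Run properties are string-to-string, so stringify values before writing.
    merged = {**current, **{key: str(value) for key, value in new_props.items()}}
    glue_client.put_workflow_run_properties(
        Name=workflow_name, RunId=run_id, RunProperties=merged
    )
    return merged


def read_run_properties(glue_client: Any, workflow_name: str, run_id: str,
                        required: List[str]) -> Dict[str, str]:
    """Return the required properties, failing early if any are missing."""
    props = glue_client.get_workflow_run_properties(
        Name=workflow_name, RunId=run_id
    )["RunProperties"]
    missing = [key for key in required if key not in props]
    if missing:
        raise KeyError(f"missing workflow run properties: {missing}")
    return {key: props[key] for key in required}


class FakeGlueClient:
    """Hypothetical in-memory stand-in for boto3.client('glue'); local testing only."""

    def __init__(self) -> None:
        self._props: Dict[tuple, Dict[str, str]] = {}

    def get_workflow_run_properties(self, Name: str, RunId: str) -> Dict[str, Any]:
        return {"RunProperties": dict(self._props.get((Name, RunId), {}))}

    def put_workflow_run_properties(self, Name: str, RunId: str,
                                    RunProperties: Dict[str, str]) -> Dict[str, Any]:
        self._props[(Name, RunId)] = dict(RunProperties)
        return {}


# First job writes, second job reads (made-up workflow name, run ID, and values).
client = FakeGlueClient()
publish_run_properties(client, "my-workflow", "wr_123",
                       {"param1": "s3://bucket/in", "param2": 42})
result = read_run_properties(client, "my-workflow", "wr_123", ["param1", "param2"])
```

Checking for missing keys in the second job gives a clearer error than a bare `KeyError` on a single lookup when the first job did not publish everything expected.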
For how to set up a Glue workflow, see: https://docs.aws.amazon.com/glue/latest/dg/creating_running_workflows.html
https://medium.com/@pioneer21st/orchestrating-etl-jobs-in-aws-glue-using-workflow-758ef10b8434
Upvotes: 3