N9909

Reputation: 247

Invoke a Glue job from another Glue job

I have two Glue jobs, created from the AWS console. I would like to invoke one Glue job (Python) from another Glue job (Python), passing parameters. What would be the best approach to do this? I appreciate your help.

Upvotes: 1

Views: 4960

Answers (1)

Yuva

Reputation: 3173

You can use Glue workflows and set up workflow parameters (run properties), as mentioned by Bob Haffner. Trigger the Glue jobs through the workflow. The advantage is that if the second Glue job fails, you can fix the issue and then resume or rerun only the second job. Workflow parameters can also be passed from one Glue job to the next. Sample code for reading and writing workflow parameters:

In the first Glue job:

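import sys
import boto3
from awsglue.utils import getResolvedOptions

glue_client = boto3.client('glue')

# WORKFLOW_NAME and WORKFLOW_RUN_ID are supplied when the job is started by a workflow
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
workflow_name = args['WORKFLOW_NAME']
workflow_run_id = args['WORKFLOW_RUN_ID']
workflow_params = glue_client.get_workflow_run_properties(Name=workflow_name, RunId=workflow_run_id)["RunProperties"]

# param_value1..4 stand for values computed earlier in this job; run property values must be strings
workflow_params['param1'] = param_value1
workflow_params['param2'] = param_value2
workflow_params['param3'] = param_value3
workflow_params['param4'] = param_value4
glue_client.put_workflow_run_properties(Name=workflow_name, RunId=workflow_run_id, RunProperties=workflow_params)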

and in the second glue job:

import sys
import boto3
from awsglue.utils import getResolvedOptions

glue_client = boto3.client('glue')

args = getResolvedOptions(sys.argv, ['WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
workflow_name = args['WORKFLOW_NAME']
workflow_run_id = args['WORKFLOW_RUN_ID']
workflow_params = glue_client.get_workflow_run_properties(Name=workflow_name, RunId=workflow_run_id)["RunProperties"]

# Read back the values written by the first job
param_value1 = workflow_params['param1']
param_value2 = workflow_params['param2']
param_value3 = workflow_params['param3']
param_value4 = workflow_params['param4']
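
The read/write pattern used in the two jobs could also be wrapped in a pair of small helpers. This is only a sketch: `read_run_properties` and `update_run_properties` are hypothetical names that are not part of boto3 or Glue, and the Glue client is passed in as an argument so the helpers can be tried out without AWS access. Run property values are stored as strings, so the sketch converts them with `str()`.

```python
def read_run_properties(glue_client, workflow_name, workflow_run_id):
    """Return the run properties of a workflow run as a plain dict."""
    response = glue_client.get_workflow_run_properties(
        Name=workflow_name, RunId=workflow_run_id
    )
    return dict(response["RunProperties"])


def update_run_properties(glue_client, workflow_name, workflow_run_id, **new_values):
    """Merge new values into the existing run properties and write them back.

    Hypothetical helper: the name and signature are illustrative only.
    """
    properties = read_run_properties(glue_client, workflow_name, workflow_run_id)
    # Run properties are string-to-string pairs, so coerce every value.
    properties.update({key: str(value) for key, value in new_values.items()})
    glue_client.put_workflow_run_properties(
        Name=workflow_name, RunId=workflow_run_id, RunProperties=properties
    )
    return properties
```

Inside a job, the first one would call something like `update_run_properties(boto3.client('glue'), args['WORKFLOW_NAME'], args['WORKFLOW_RUN_ID'], param1=some_value)`, and the second would call `read_run_properties(...)` with the same workflow name and run ID.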

To set up a Glue workflow, see: https://docs.aws.amazon.com/glue/latest/dg/creating_running_workflows.html

https://medium.com/@pioneer21st/orchestrating-etl-jobs-in-aws-glue-using-workflow-758ef10b8434
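
As an alternative to the console steps in the links above, a two-job workflow could probably be created with boto3 as well. The following is a rough sketch under stated assumptions: `create_two_job_workflow` and the trigger naming scheme are hypothetical, both jobs are assumed to exist already, and the Glue client is passed in as an argument. It uses the `create_workflow` and `create_trigger` calls of the boto3 Glue client: an on-demand trigger starts the first job, and a conditional trigger starts the second job once the first has succeeded.

```python
def create_two_job_workflow(glue_client, workflow_name, first_job, second_job,
                            default_properties=None):
    """Create a workflow that runs first_job, then second_job on success.

    Hypothetical helper; returns the names of the two triggers it creates.
    """
    start_trigger = f"{workflow_name}-start"
    chain_trigger = f"{workflow_name}-after-first"

    # Default run properties are the initial workflow parameters for each run.
    glue_client.create_workflow(
        Name=workflow_name,
        DefaultRunProperties=default_properties or {},
    )

    # The on-demand trigger is the entry point of the workflow.
    glue_client.create_trigger(
        Name=start_trigger,
        WorkflowName=workflow_name,
        Type="ON_DEMAND",
        Actions=[{"JobName": first_job}],
    )

    # The conditional trigger fires only when the first job has succeeded.
    glue_client.create_trigger(
        Name=chain_trigger,
        WorkflowName=workflow_name,
        Type="CONDITIONAL",
        StartOnCreation=True,
        Predicate={
            "Conditions": [
                {
                    "LogicalOperator": "EQUALS",
                    "JobName": first_job,
                    "State": "SUCCEEDED",
                }
            ]
        },
        Actions=[{"JobName": second_job}],
    )
    return [start_trigger, chain_trigger]
```

With a real client (`boto3.client('glue')`), the workflow could then be started with `glue_client.start_workflow_run(Name=workflow_name)`, which returns the run ID.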

Upvotes: 3
