Swarnitha

Reputation: 45

How to pass RunProperties while calling the glue workflow using boto3 and python in lambda function?

My python code in lambda function:

import json

import boto3
from botocore.exceptions import ClientError

glue_client = boto3.client('glue')

default_run_properties = {'s3_path': 's3://bucketname/abc.zip'}

response = glue_client.start_workflow_run(Name="Testing", RunProperties=default_run_properties)
print(response)

I am getting error like this:

"errorMessage": "Parameter validation failed:\nUnknown parameter in input: \"RunProperties\", must be one of: Name",
  "errorType": "ParamValidationError",

I also tried like this :

session = boto3.session.Session()
glue_client = session.client('glue')

But got the same error.

Can anyone tell me how to pass RunProperties when starting a Glue workflow run? The RunProperties are dynamic and need to be passed in from the Lambda event.

Upvotes: 1

Views: 1167

Answers (2)

elshev

Reputation: 1543

I had the same problem and asked in AWS re:Post. The problem is the old boto3 version used in Lambda. They recommended two ways to work around this issue:

  1. Set the run properties on the workflow run immediately after start_workflow_run:
default_run_properties = {'s3_path': 's3://bucketname/abc.zip'}

response = glue_client.start_workflow_run(Name="Testing")

updateRun = glue_client.put_workflow_run_properties(
    Name = "Testing",
    RunId = response['RunId'],
    RunProperties = default_run_properties
)
  2. Or you can create a Lambda layer for your Lambda function and include a newer boto3 version there.
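
The first workaround can be wired to the Lambda event roughly as follows. This is only a sketch: the event key `run_properties` and the helper name `start_workflow_with_properties` are assumptions made for illustration, not something from the original question. The helper takes the client as an argument, so it can be tried with a stub before deploying.

```python
def start_workflow_with_properties(glue_client, workflow_name, run_properties):
    """Start a Glue workflow run, then attach run properties to that run."""
    # Glue run properties map strings to strings, so coerce keys and values.
    props = {str(k): str(v) for k, v in run_properties.items()}

    # Older boto3 versions accept only Name here, so start the run first...
    run_id = glue_client.start_workflow_run(Name=workflow_name)["RunId"]

    # ...and then set the properties on the run that was just created.
    glue_client.put_workflow_run_properties(
        Name=workflow_name,
        RunId=run_id,
        RunProperties=props,
    )
    return run_id


def lambda_handler(event, context):
    # Imported here so the helper above can be exercised without AWS access.
    import boto3

    glue_client = boto3.client("glue")
    # "run_properties" is a hypothetical event key; adapt it to your event shape.
    run_id = start_workflow_with_properties(
        glue_client, "Testing", event.get("run_properties", {})
    )
    return {"RunId": run_id}
```

Because the properties are set after the run has started, a job that starts very quickly might read them before they are written, so it is worth checking this against your own workflow.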

Upvotes: 1

Uwe Bretschneider

Reputation: 1281

I had the same issue and this is a bit tricky. I do not like my solution, so maybe someone else has a better idea? See here: https://github.com/boto/boto3/issues/2580 And also here: https://docs.aws.amazon.com/glue/latest/webapi/API_StartWorkflowRun.html

So, you cannot pass the parameters when starting the workflow, which is a shame in my opinion, because even the CLI suggests that: https://docs.aws.amazon.com/cli/latest/reference/glue/start-workflow-run.html

However, you can update the workflow's default run properties before you start it. These values then apply to every run started afterwards, so if you expect any concurrency issues this is not a good way to go. You need to decide whether to reset the values afterwards or just leave them until the next start of the workflow.

I start my workflows like this:

glue_client.update_workflow(
    Name=SHOPS_WORKFLOW_NAME,
    DefaultRunProperties={
        's3_key': file_key,
        'market_id': segments[0],
    },
)

# start_workflow_run returns a dict; the run ID is under 'RunId'
workflow_run_id = glue_client.start_workflow_run(
    Name=SHOPS_WORKFLOW_NAME
)['RunId']

This basically produces the following in the next run: [image]
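
If you choose to reset the values afterwards, one possible shape is sketched below. It assumes that a run copies the workflow's default run properties at the moment it starts, which you should verify for your setup. The helper name `start_with_default_properties` is made up for this example, and it does nothing to prevent two concurrent invocations from overwriting each other's defaults.

```python
def start_with_default_properties(glue_client, workflow_name, run_properties):
    """Temporarily set default run properties, start a run, then restore them."""
    # Glue run properties map strings to strings, so coerce keys and values.
    props = {str(k): str(v) for k, v in run_properties.items()}

    # Remember the current defaults so they can be put back afterwards.
    workflow = glue_client.get_workflow(Name=workflow_name)["Workflow"]
    previous = workflow.get("DefaultRunProperties", {})

    glue_client.update_workflow(Name=workflow_name, DefaultRunProperties=props)
    try:
        run_id = glue_client.start_workflow_run(Name=workflow_name)["RunId"]
    finally:
        # Restore the earlier defaults even if starting the run failed.
        glue_client.update_workflow(
            Name=workflow_name, DefaultRunProperties=previous
        )
    return run_id
```

Restoring in a `finally` block keeps the workflow's defaults from being left in a modified state when `start_workflow_run` raises.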

Upvotes: 3
