Reputation: 43
I read from this documentation that the ProcessingStep can accept job arguments.
I currently have a Python script containing a function, to be executed via ProcessingStep, that requires arguments to be passed in. I am not sure how to extract the arguments from the 'Job arguments' so that I can call the function in the script with them.
Here is a code snippet from my Python script:
def params(input_params):
    details = {"database": input_params[0],
               "table": input_params[1],
               "catalog": input_params[2],
               "earliestday": int(input_params[3]),
               "latestday": int(input_params[4]),
               "s3bucket": input_params[5],
               "bucketpath": input_params[6]}
    return details

output_params = params(input_params)  # this is where I'm not sure how I can extract the argument from the job arguments in the ProcessingStep to call my function here
Here's what my processing step code looks like:
step_params = ProcessingStep(
    name="StateParams",
    processor=sklearn_processor,
    outputs=[processing_output],
    # This is the job argument I input which I hope will be parsed into my python file function
    job_arguments=["ABC", "SESSION_123", "AwsDataCatalog", "5", "7", "mybucket", "bucket2/tmp/athena_sagemaker"],
    code="params.py",
)
I would greatly appreciate any advice on how to use the job arguments in the ProcessingStep to call the function in my Python script. Thanks!
Upvotes: 1
Views: 3236
Reputation: 576
You can pass the job arguments to ProcessingStep as flag/value pairs, like this:
sample_argument_1 = "sample_arg_1"
sample_argument_2 = "sample_arg_2"
step_params = ProcessingStep(
    name="StateParams",
    processor=sklearn_processor,
    outputs=[processing_output],
    job_arguments=["--sample-argument-1", sample_argument_1, "--sample-argument-2", sample_argument_2],
    code="params.py",
)
In your params.py file, you can read the arguments with argparse like this:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--sample-argument-1', type=str, dest='sample_argument_1')
parser.add_argument('--sample-argument-2', type=str, dest='sample_argument_2')
args = parser.parse_args()
arg_1 = args.sample_argument_1
print("arg_1:")
print(arg_1)
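Applied to the seven values in the question, the same approach might look like the sketch below. The flag names (`--database`, `--table`, and so on) and the `parse_job_arguments` helper are my own choices for illustration, not something the SDK requires. It assumes `job_arguments` in the ProcessingStep is rewritten as flag/value pairs, e.g. `["--database", "ABC", "--table", "SESSION_123", ...]`, and that those strings reach the script as ordinary command-line arguments.

```python
import argparse


def parse_job_arguments(argv=None):
    """Parse flag/value job arguments into the dict the question builds.

    Flag names are hypothetical; they only need to match whatever is
    listed in job_arguments on the ProcessingStep side.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--database", type=str, required=True)
    parser.add_argument("--table", type=str, required=True)
    parser.add_argument("--catalog", type=str, required=True)
    # type=int replaces the manual int(...) conversion from the question
    parser.add_argument("--earliestday", type=int, required=True)
    parser.add_argument("--latestday", type=int, required=True)
    parser.add_argument("--s3bucket", type=str, required=True)
    parser.add_argument("--bucketpath", type=str, required=True)
    # With argv=None, argparse falls back to sys.argv[1:]
    args = parser.parse_args(argv)
    return vars(args)


# Local demonstration with an explicit list; inside params.py you would
# call parse_job_arguments() with no argument instead.
example_argv = [
    "--database", "ABC",
    "--table", "SESSION_123",
    "--catalog", "AwsDataCatalog",
    "--earliestday", "5",
    "--latestday", "7",
    "--s3bucket", "mybucket",
    "--bucketpath", "bucket2/tmp/athena_sagemaker",
]
details = parse_job_arguments(example_argv)
print(details)
```

If you would rather keep the purely positional list from the question, `sys.argv[1:]` should hold the same seven strings in order, so `params(sys.argv[1:])` is a plausible alternative; named flags are just easier to read and validate.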
Upvotes: 3