Reputation: 3085
I am working on creating a SageMaker pipeline. In the evaluation step, I would like to pass an argument to my preprocess.py script.
There are a few examples online of how to do so (a sample is below), but they all use static values. I want to pass a workflow parameter (a string in this case) to the script.
I tried multiple approaches to no avail, and I even opened a GitHub issue but have received no response so far.
The linked GitHub issue details all the approaches I've taken so far, but it all boils down to the fact that a workflow parameter is only evaluated at runtime.
I would like to know if what I want to do is possible or not.
sklearn_processor.run(
    code="preprocess.py",
    inputs=[
        ProcessingInput(source='my_package/', destination='/opt/ml/processing/input/code/my_package/')
    ],
    outputs=[
        ProcessingOutput(output_name="test_transform_data",
                         source='/opt/ml/processing/output/test_transform',
                         destination=out_path),
    ],
    arguments=["--time-slot-minutes", "30min"]
)
source for the sample code: How to pass region to the SKLearnProcessor - botocore.exceptions.NoRegionError: You must specify a region
step_args = myprocessor.run(
    inputs=[
        ProcessingInput(source=s3_full_address, destination="/opt/ml/processing/input"),
    ],
    outputs=[
        ProcessingOutput(output_name="raw", source="/opt/ml/processing/train"),
        ProcessingOutput(output_name="test", source="/opt/ml/processing/test"),
    ],
    code="generate_train_test_data.py",
    arguments=["--s3_prefix", s3_prefix]
)
Where s3_prefix is a workflow parameter defined as s3_prefix = ParameterString(name="InputPrefix", default_value="myprefix")
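To illustrate what "only evaluated at runtime" means here, below is a minimal sketch that uses a stand-in class, not the SageMaker SDK. FakeParameterString, render_arguments and resolve_arguments are hypothetical names made up for this illustration, and the reference format is only illustrative. The point is that when the pipeline is defined, the argument list holds a reference to the parameter rather than its string value, so any code that needs the actual string at definition time cannot work.

```python
import json


class FakeParameterString:
    """Hypothetical stand-in for a workflow parameter; NOT the SageMaker class."""

    def __init__(self, name, default_value=None):
        self.name = name
        self.default_value = default_value

    def to_reference(self):
        # At definition time only a reference to the parameter exists.
        return {"Get": f"Parameters.{self.name}"}


def render_arguments(arguments):
    """Serialize an argument list, keeping parameters as references."""
    return [
        a.to_reference() if isinstance(a, FakeParameterString) else a
        for a in arguments
    ]


def resolve_arguments(rendered, runtime_values):
    """Replace references with concrete values, as an engine would at run time."""
    resolved = []
    for item in rendered:
        if isinstance(item, dict) and "Get" in item:
            key = item["Get"].split(".", 1)[1]
            resolved.append(runtime_values[key])
        else:
            resolved.append(item)
    return resolved


s3_prefix = FakeParameterString(name="InputPrefix", default_value="myprefix")

# What the definition contains: a reference, not a string.
rendered = render_arguments(["--s3_prefix", s3_prefix])
print(json.dumps(rendered))

# What the script would receive once the execution supplies a value.
final_arguments = resolve_arguments(rendered, {"InputPrefix": "myprefix"})
print(final_arguments)
```

In this sketch the string "myprefix" only appears after resolve_arguments runs, which mirrors why approaches that try to read the value while building the step fail.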
Upvotes: 2
Views: 733
Reputation: 3085
To pass a workflow parameter to your script, you can use the job_arguments option.
Update your step definition to add job_arguments:
ProcessingStep(
    name="step-name",
    processor=my_processor,
    # my_argument is the workflow parameter (for example a ParameterString)
    job_arguments=[
        "--my_argument", my_argument
    ],
    ...
    code="myscript.py"
)
In your script (myscript.py in this example), read the argument as follows:
import argparse

def parse_args():
    parser = argparse.ArgumentParser()
    # values listed in job_arguments are passed as command-line arguments to the script
    parser.add_argument('--my_argument', type=str)
    # parse_known_args returns (namespace, list_of_unrecognized_arguments)
    return parser.parse_known_args()

args, _ = parse_args()
my_argument = args.my_argument
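To check the parsing logic locally without running a pipeline, here is a small sketch that feeds an explicit argument list to the parser. The argv parameter, the value "myprefix" and the extra --other-flag are assumptions added for the demonstration; in the real job the list comes from job_arguments.

```python
import argparse


def parse_args(argv=None):
    """Parse --my_argument; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--my_argument", type=str)
    # parse_known_args tolerates flags the parser does not define and
    # returns them in a separate list instead of raising an error.
    return parser.parse_known_args(argv)


# Simulate the command line the processing container would build.
args, unknown = parse_args(["--my_argument", "myprefix", "--other-flag", "1"])
print(args.my_argument)
print(unknown)
```

Using parse_known_args rather than parse_args keeps the script from exiting if the job passes additional arguments that the script does not declare.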
Upvotes: 1