Reputation: 21
We want to build a pipeline in Azure Machine Learning Studio. Within the pipeline script we want to pass arguments via PythonScriptStep() to the scoring.py file. This works so far, as you can see in the code below.
scoring_step1 = PythonScriptStep(
    name="Scoring_Step1",
    source_directory='./source_directory',
    script_name="scoring1.py",
    arguments=[
        '--input-data-label', input_data_label,
        '--input-data-folder', input_datasetfiles.as_named_input(input_data_label).as_mount(),
        '--output-data-folder-name', cache_folder_name,
        '--output-data-folder', cache_data_folder,
        '--message', "This is a test message"
    ],
    outputs=[cache_data_folder],
    compute_target=aml_compute,
    runconfig=pipeline_run_config,
    allow_reuse=False
)
To read the arguments within the scoring.py file we use argparse.ArgumentParser, as you can see below.
import argparse

# Get parameters
parser = argparse.ArgumentParser()
# the option names must match the names passed in the step's "arguments" list
parser.add_argument("--input-data-label", type=str, dest='input_data_label', default='input_data_label', help='given input data label')
parser.add_argument("--input-data-folder", type=str, dest='not_important_when_as_mount', help='arbitrary text')
parser.add_argument('--output-data-folder-name', type=str, dest='output_data_folder_name', default='output_data_folder_name',
                    help='given output data folder name')
parser.add_argument('--output-data-folder', type=str, dest='output_data_folder', default='output_data_folder', help='Folder for results')
parser.add_argument('--message', type=str, dest='message', default='Hallo Alex', help='test message')
args = parser.parse_args()
input_data_label = args.input_data_label
output_data_folder_name = args.output_data_folder_name
output_data_folder_path = args.output_data_folder # get the path to the datafolder when using "PipelineData"
message = args.message
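For reference, this parsing pattern can be tried locally without Azure ML. The following is only a minimal sketch, assuming the step hands the script its arguments as plain command-line strings; `build_parser`, `sample_argv` and the path `/tmp/mounted/input` are hypothetical names and values used for illustration, not something taken from the pipeline.

```python
import argparse


def build_parser():
    """Build a parser with a subset of the options used in scoring1.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--input-data-label", type=str, dest="input_data_label")
    parser.add_argument("--input-data-folder", type=str, dest="input_data_folder")
    parser.add_argument("--message", type=str, dest="message", default="Hallo Alex")
    return parser


# Hypothetical argument list, standing in for what the script would
# receive on its command line (the folder path is a placeholder).
sample_argv = [
    "--input-data-label", "my_label",
    "--input-data-folder", "/tmp/mounted/input",
    "--message", "This is a test message",
]

# Passing a list to parse_args() avoids reading sys.argv, which makes
# the parser easy to check outside of a pipeline run.
args = build_parser().parse_args(sample_argv)
print(args.input_data_label, args.input_data_folder, args.message)
```

Each hyphenated option name is exposed on `args` under the name given in `dest`.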
This works fine so far. When we additionally pass the "ds_consumption" argument, which we need for further processing, the argument passing no longer works. You can see the code below.
from azureml.core import Dataset
from azureml.data.dataset_consumption_config import DatasetConsumptionConfig
from azureml.pipeline.core import PipelineParameter
from azureml.pipeline.steps import PythonScriptStep

# Wrap the registered dataset in a pipeline parameter and expose it as the named input "dataset"
customer_dataset = Dataset.get_by_name(ws, name='Dreg_Test')
pipeline_param = PipelineParameter(name="pipeline_param", default_value=customer_dataset)
ds_consumption = DatasetConsumptionConfig("dataset", pipeline_param)
scoring_step2 = PythonScriptStep(
    name="Scoring_Step2",
    source_directory='./source_directory',
    script_name="scoring2.py",
    arguments=[
        '--ds_consumption', ds_consumption,
        '--input-data-label', input_data_label,
        '--input-data-folder', input_datasetfiles.as_named_input(input_data_label).as_mount(),
        '--output-data-folder-name', cache_folder_name,
        '--output-data-folder', cache_data_folder,
        '--message', "This is a test message"
    ],
    inputs=[ds_consumption],
    outputs=[cache_data_folder],
    compute_target=aml_compute,
    runconfig=pipeline_run_config,
    allow_reuse=False
)
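# In scoring2.py: get the experiment run context
from azureml.core import Run

run = Run.get_context()
# Look up the dataset by the name given to DatasetConsumptionConfig ("dataset")
dataset = run.input_datasets["dataset"]
data = dataset.to_pandas_dataframe()
# End the run
run.complete()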
We get an error message when running scoring2.py, at the line args = parser.parse_args().
Can anybody help us with this specific problem? We don't see why the argument passing stops working once we pass the "ds_consumption" object.
Thank you very much for any help!
Upvotes: 2
Views: 370
Reputation: 1683
One small change to the code block may help.
Replace the dataset setup in the code block above with the snippet below and run it again.
file_dataset = Dataset.File.from_files('your_file_path')
file_pipeline_param = PipelineParameter(name="file_ds_param", default_value=file_dataset)
ds_consumption = DatasetConsumptionConfig("input_1", file_pipeline_param).as_mount()
experiment.submit(ScriptRunConfig(source_directory, arguments=[ds_consumption]))
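Separately, since the failure is reported at parser.parse_args(), it may be worth checking that the parser in scoring2.py declares --ds_consumption. The error text is not shown in the question and the scoring2.py parser is not posted, so this is only one possible cause: argparse stops with an error when it receives an option that was never added. The sketch below reproduces that behaviour locally; `make_parser`, `sample_argv` and the path `/tmp/mounted/dataset` are hypothetical and only stand in for whatever string the step passes for the dataset.

```python
import argparse


def make_parser(declare_ds_consumption):
    """Build a small parser, optionally declaring --ds_consumption."""
    parser = argparse.ArgumentParser()
    if declare_ds_consumption:
        parser.add_argument("--ds_consumption", type=str, dest="ds_consumption")
    parser.add_argument("--message", type=str, dest="message", default="Hallo Alex")
    return parser


# Hypothetical command line; the dataset value is a placeholder string.
sample_argv = [
    "--ds_consumption", "/tmp/mounted/dataset",
    "--message", "This is a test message",
]

# 1) Option not declared: parse_args() reports "unrecognized arguments"
#    on stderr and raises SystemExit.
try:
    make_parser(False).parse_args(sample_argv)
    strict_failed = False
except SystemExit:
    strict_failed = True

# 2) Option declared: parsing succeeds and the value is available.
args = make_parser(True).parse_args(sample_argv)

# 3) Alternative: parse_known_args() collects undeclared options
#    in a separate list instead of failing.
known, unknown = make_parser(False).parse_known_args(sample_argv)

print(strict_failed, args.ds_consumption, unknown)
```

If this is the cause, adding a matching parser.add_argument('--ds_consumption', ...) line in scoring2.py should be enough; the dataset itself can still be read through run.input_datasets["dataset"] as in the question.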
Kindly update the result after checking.
Upvotes: 1