randomnickname

Reputation: 21

Passing Arguments through PythonScriptStep() in AMLS

We want to build a pipeline in Azure Machine Learning Studio. Within the pipeline script we want to pass arguments via PythonScriptStep() to the scoring.py file. This is working so far, as you can see in the code below.

scoring_step1 = PythonScriptStep(
    name="Scoring_Step1",
    source_directory='./source_directory',
    script_name="scoring1.py",
    arguments=[
        '--input-data-label', input_data_label,
        '--input-data-folder', input_datasetfiles.as_named_input(input_data_label).as_mount(),
        '--output-data-folder-name', cache_folder_name,
        '--output-data-folder', cache_data_folder,
        '--message', "This is a test message"
    ],
    outputs=[cache_data_folder],
    compute_target=aml_compute,
    runconfig=pipeline_run_config,
    allow_reuse=False
)

To read these parameters within the scoring.py file we use argparse.ArgumentParser, as you can see below.

import argparse

# Get parameters
parser = argparse.ArgumentParser()

# argument names need to have the same name when you call them somewhere else
parser.add_argument("--input-data-label", type=str, dest='input_data_label', default='input_data_label', help='given input data label')
parser.add_argument("--input-data-folder", type=str, dest='not_important_when_as_mount', help='arbitrary text')
parser.add_argument('--output-data-folder-name', type=str, dest='output_data_folder_name', default='output_data_folder_name', 
                    help='given output data folder name')
parser.add_argument('--output-data-folder', type=str, dest='output_data_folder', default='output_data_folder', help='Folder for results')
parser.add_argument('--message', type=str, dest='message', default='Hallo Alex', help='test message')

args = parser.parse_args()
input_data_label = args.input_data_label
output_data_folder_name = args.output_data_folder_name
output_data_folder_path = args.output_data_folder # get the path to the datafolder when using "PipelineData"
message = args.message
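
For example, continuing from the parser defined above, parsing a hypothetical argument list locally (the values below are made up; inside the pipeline the real values are injected by the PythonScriptStep arguments) yields the expected attributes:

# Quick local sanity check with made-up values, outside of Azure ML.
args = parser.parse_args([
    '--input-data-label', 'my_label',
    '--output-data-folder-name', 'cache',
    '--output-data-folder', '/tmp/cache',
    '--message', 'This is a test message'
])
print(args.input_data_label, args.output_data_folder_name, args.message)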

This is working fine so far. However, when we additionally pass the "ds_consumption" argument, which we need for further processing, the argument passing is NOT working anymore. You can see the code below.


customer_dataset = Dataset.get_by_name(ws, name='Dreg_Test')
pipeline_param = PipelineParameter(name="pipeline_param", default_value=customer_dataset)
ds_consumption = DatasetConsumptionConfig("dataset", pipeline_param)

scoring_step2 = PythonScriptStep(
    name="Scoring_Step2",
    source_directory='./source_directory',
    script_name="scoring2.py",
    arguments=[
        '--ds_consumption', ds_consumption,
        '--input-data-label', input_data_label,
        '--input-data-folder', input_datasetfiles.as_named_input(input_data_label).as_mount(),
        '--output-data-folder-name', cache_folder_name,
        '--output-data-folder', cache_data_folder,
        '--message', "This is a test message"
    ],
    inputs=[ds_consumption],
    outputs=[cache_data_folder],
    compute_target=aml_compute,
    runconfig=pipeline_run_config,
    allow_reuse=False
)

In scoring2.py we then try to access the dataset like this:

from azureml.core import Run

# Get the experiment run context
run = Run.get_context()
dataset = run.input_datasets["dataset"]
data = dataset.to_pandas_dataframe()

# End the run
run.complete()

We get the following error message when running scoring2.py, at the line args = parser.parse_args():

[screenshot of the error message]

Can anybody help us with this specific problem? We don't see why the argument passing is not working anymore when passing the "ds_consumption" object.

Thank you very much for any help!

Upvotes: 2

Views: 370

Answers (1)

Sairam Tadepalli

Reputation: 1683

There is a small suggested change to the code block.

  • Current code block: the snippet in the question that creates the PipelineParameter and DatasetConsumptionConfig from the registered dataset.

With respect to that code block, replace it with the snippet below and execute.

file_dataset = Dataset.File.from_files('your_file_path')
file_pipeline_param = PipelineParameter(name="file_ds_param", default_value=file_dataset)
ds_consumption = DatasetConsumptionConfig("input_1", file_pipeline_param).as_mount()
experiment.submit(ScriptRunConfig(source_directory, arguments=[ds_consumption]))
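
For completeness, here is a minimal, unverified sketch of how the mounted dataset parameter could be attached to the PythonScriptStep instead of a plain ScriptRunConfig, and then read back inside the script. The names aml_compute, pipeline_run_config and scoring2.py are taken from the question; everything else is an assumption.

# Pipeline side -- sketch only; aml_compute and pipeline_run_config are assumed
# to exist as in the question.
from azureml.core import Dataset
from azureml.data.dataset_consumption_config import DatasetConsumptionConfig
from azureml.pipeline.core import PipelineParameter
from azureml.pipeline.steps import PythonScriptStep

file_dataset = Dataset.File.from_files('your_file_path')
file_pipeline_param = PipelineParameter(name="file_ds_param", default_value=file_dataset)
ds_consumption = DatasetConsumptionConfig("dataset", file_pipeline_param).as_mount()

scoring_step2 = PythonScriptStep(
    name="Scoring_Step2",
    source_directory='./source_directory',
    script_name="scoring2.py",
    arguments=['--ds_consumption', ds_consumption],  # resolves to the mount path at run time
    inputs=[ds_consumption],
    compute_target=aml_compute,
    runconfig=pipeline_run_config,
    allow_reuse=False
)

# scoring2.py side -- the mounted dataset arrives as a plain path string.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--ds_consumption', type=str, dest='ds_consumption', help='mount path of the input dataset')
args = parser.parse_args()
print('Dataset is mounted at:', args.ds_consumption)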

Kindly update the result after checking.

Upvotes: 1
