Reputation: 752

Metrics for any step of a SageMaker pipeline (not just TrainingStep)

My understanding is that, to compare different trials of a pipeline (see image), metrics can only be obtained from the TrainingStep, via the metric_definitions argument of an Estimator.

In my pipeline, I extract metrics in the evaluation step that follows training. Is it possible to record metrics there so that they are tracked for each trial?

Upvotes: 1

Answers (1)

Giuseppe La Gualano

Reputation: 1720

SageMaker suggests using property files (PropertyFile) together with JsonGet for any step whose results you need to expose. This approach is intended for driving conditional steps within the pipeline, but it also works for simply persisting results, since the JSON file is written to the step's output.

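from sagemaker.workflow.properties import PropertyFile
from sagemaker.workflow.steps import ProcessingStep

# output_name must match the output_name of one of the step's ProcessingOutput entries;
# path is the location of the JSON file relative to that output.
evaluation_report = PropertyFile(
    name="EvaluationReport",
    output_name="evaluation",
    path="evaluation.json",
)

step_eval = ProcessingStep(
    # ...
    property_files=[evaluation_report],
)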

And in your processing script:

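import json

report_dict = {}  # your report
# Assumed to be the container directory mapped to the "evaluation" output
evaluation_path = "/opt/ml/processing/evaluation/evaluation.json"

# Serialize the report straight to the output file
with open(evaluation_path, "w") as f:
    json.dump(report_dict, f)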

You can read this file in the pipeline steps directly with JsonGet.
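
To make the lookup concrete, here is a minimal local sketch of what ends up in evaluation.json and how a dotted path into it resolves. The report layout, the metric value, and the helper names write_report and resolve_json_path are assumptions made for illustration only. They are not part of the SageMaker SDK, and the helper handles only plain dotted keys.

```python
import json
import tempfile
from pathlib import Path


def write_report(metrics: dict, output_dir: str) -> Path:
    """Write metrics as evaluation.json inside output_dir and return its path."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    report_path = out / "evaluation.json"
    with report_path.open("w") as f:
        json.dump(metrics, f)
    return report_path


def resolve_json_path(report: dict, json_path: str):
    """Hypothetical helper: follow a dotted path such as 'metrics.accuracy.value'."""
    node = report
    for key in json_path.split("."):
        node = node[key]
    return node


# Illustrative layout and value only, not real results.
report_dict = {"metrics": {"accuracy": {"value": 0.5}}}

# A temporary directory stands in for /opt/ml/processing/evaluation.
with tempfile.TemporaryDirectory() as tmp:
    path = write_report(report_dict, tmp)
    loaded = json.loads(path.read_text())

accuracy = resolve_json_path(loaded, "metrics.accuracy.value")
print(accuracy)
```

In the pipeline definition, the corresponding reference would be something like JsonGet(step_name=step_eval.name, property_file=evaluation_report, json_path="metrics.accuracy.value"), with JsonGet imported from sagemaker.workflow.functions. The json_path has to match whatever structure your own script writes.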

Upvotes: 1
