SageMaker endpoint reported as invalid when create_monitoring_schedule is called on it

Question

I am following this GitHub repo and adapting it to a text classification problem built on DistilBERT. Given a string of text, the model should return a label and a (probability) score. Here is a sample call to the model:

sentiment_input = {"inputs": "I love using the new Inference DLC."}

# sentiment_input= "I love using the new Inference DLC."

response = predictor.predict(data=sentiment_input)
print(response)

Output:

[{'label': 'LABEL_80', 'score': 0.008507220074534416}]

When I run the following:

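# Create an EndpointInput that tells the monitor which attributes hold
# the predicted label and its probability score.
from sagemaker.model_monitor import EndpointInput

endpointInput = EndpointInput(
    endpoint_name=predictor.endpoint_name,
    probability_attribute="score",
    inference_attribute="label",
    # probability_threshold_attribute=0.5,
    destination="/opt/ml/processing/input_data",
)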

# Create the monitoring schedule to execute every hour.
from sagemaker.model_monitor import CronExpressionGenerator

response = clinc_intent0911.create_monitoring_schedule(
    monitor_schedule_name=clincintent_monitor_schedule_name,
    endpoint_input=endpointInput,
    output_s3_uri=baseline_results_uri,
    problem_type="MulticlassClassification",
    ground_truth_input=ground_truth_upload_path,
    constraints=baseline_job.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)

I get the following error:

---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
 in 
     10     constraints=baseline_job.suggested_constraints(),
     11     schedule_cron_expression=CronExpressionGenerator.hourly(),
---> 12     enable_cloudwatch_metrics=True,
     13 )

/opt/conda/lib/python3.6/site-packages/sagemaker/model_monitor/model_monitoring.py in create_monitoring_schedule(self, endpoint_input, ground_truth_input, problem_type, record_preprocessor_script, post_analytics_processor_script, output_s3_uri, constraints, monitor_schedule_name, schedule_cron_expression, enable_cloudwatch_metrics)
   2615             network_config=self.network_config,
   2616         )
-> 2617         self.sagemaker_session.sagemaker_client.create_model_quality_job_definition(**request_dict)
   2618 
   2619         # create schedule

/opt/conda/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    355                     "%s() only accepts keyword arguments." % py_operation_name)
    356             # The "self" in this scope is referring to the BaseClient.
--> 357             return self._make_api_call(operation_name, kwargs)
    358 
    359         _api_call.__name__ = str(py_operation_name)

/opt/conda/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    674             error_code = parsed_response.get("Error", {}).get("Code")
    675             error_class = self.exceptions.from_code(error_code)
--> 676             raise error_class(parsed_response, operation_name)
    677         else:
    678             return parsed_response

ClientError: An error occurred (ValidationException) when calling the CreateModelQualityJobDefinition operation: Endpoint 'clinc-intent-analysis-0911' does not exist or is not valid

At this point my SageMaker endpoint is live, and I am unable to work out why it is being reported as not valid.
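One thing that may be worth ruling out (an assumption on my part, since the error message does not say why the endpoint is rejected) is whether the endpoint is actually InService in the region the client is using and whether data capture is enabled on it, since Model Monitor jobs read captured endpoint data. The sketch below inspects the response of boto3's describe_endpoint call. The helper names endpoint_monitoring_issues and describe_endpoint are hypothetical, and the checks are illustrative rather than the exact validation SageMaker performs.

```python
def endpoint_monitoring_issues(description):
    """Return a list of possible problems found in a describe_endpoint response.

    `description` is the dict returned by the SageMaker DescribeEndpoint API.
    The checks are heuristics, not SageMaker's own validation rules.
    """
    issues = []

    # The endpoint should have finished creating or updating.
    status = description.get("EndpointStatus")
    if status != "InService":
        issues.append(f"endpoint status is {status!r}, expected 'InService'")

    # Model Monitor analyses captured requests and responses, so data
    # capture should be configured and enabled on the endpoint.
    capture = description.get("DataCaptureConfig")
    if not capture:
        issues.append("no DataCaptureConfig on the endpoint")
    elif not capture.get("EnableCapture"):
        issues.append("data capture is configured but not enabled")

    return issues


def describe_endpoint(endpoint_name, region_name=None):
    """Fetch the endpoint description (hypothetical helper, needs AWS credentials)."""
    import boto3  # imported lazily so the checker above works without AWS access

    client = boto3.client("sagemaker", region_name=region_name)
    return client.describe_endpoint(EndpointName=endpoint_name)
```

With credentials available, something like print(endpoint_monitoring_issues(describe_endpoint(predictor.endpoint_name))) would list anything suspicious; an empty list means none of these particular checks fired, not that the endpoint is guaranteed to be accepted.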
