Baenka

Reputation: 366

SageMaker endpoint reported as invalid when create_monitoring_schedule is called on it

I am following this GitHub repo and adapting it to a text classification problem built on DistilBERT. Given a string of text, the model should return a label and a (probability) score. Here is a sample call to the model:

sentiment_input = {"inputs": "I love using the new Inference DLC."}

# sentiment_input= "I love using the new Inference DLC."

response = predictor.predict(data=sentiment_input)
print(response)

Output:

[{'label': 'LABEL_80', 'score': 0.008507220074534416}]

When I run the following:

from sagemaker.model_monitor import EndpointInput

# Create an EndpointInput that maps the model's output fields for the monitor
endpointInput = EndpointInput(
    endpoint_name=predictor.endpoint_name,
    probability_attribute="score",
    inference_attribute="label",
    # probability_threshold_attribute=0.5,
    destination="/opt/ml/processing/input_data",
)

# Create the monitoring schedule to execute every hour.
from sagemaker.model_monitor import CronExpressionGenerator

response = clinc_intent0911.create_monitoring_schedule(
    monitor_schedule_name=clincintent_monitor_schedule_name,
    endpoint_input=endpointInput,
    output_s3_uri=baseline_results_uri,
    problem_type="MulticlassClassification",
    ground_truth_input=ground_truth_upload_path,
    constraints=baseline_job.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)

I get the following error:

---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
<ipython-input-269-72e7049246fb> in <module>
     10     constraints=baseline_job.suggested_constraints(),
     11     schedule_cron_expression=CronExpressionGenerator.hourly(),
---> 12     enable_cloudwatch_metrics=True,
     13 )

/opt/conda/lib/python3.6/site-packages/sagemaker/model_monitor/model_monitoring.py in create_monitoring_schedule(self, endpoint_input, ground_truth_input, problem_type, record_preprocessor_script, post_analytics_processor_script, output_s3_uri, constraints, monitor_schedule_name, schedule_cron_expression, enable_cloudwatch_metrics)
   2615             network_config=self.network_config,
   2616         )
-> 2617         self.sagemaker_session.sagemaker_client.create_model_quality_job_definition(**request_dict)
   2618 
   2619         # create schedule

/opt/conda/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    355                     "%s() only accepts keyword arguments." % py_operation_name)
    356             # The "self" in this scope is referring to the BaseClient.
--> 357             return self._make_api_call(operation_name, kwargs)
    358 
    359         _api_call.__name__ = str(py_operation_name)

/opt/conda/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    674             error_code = parsed_response.get("Error", {}).get("Code")
    675             error_class = self.exceptions.from_code(error_code)
--> 676             raise error_class(parsed_response, operation_name)
    677         else:
    678             return parsed_response

ClientError: An error occurred (ValidationException) when calling the CreateModelQualityJobDefinition operation: Endpoint 'clinc-intent-analysis-0911' does not exist or is not valid

At this point my SageMaker endpoint is live, and I am unable to work out why it is reported as not valid.

Upvotes: 0

Views: 270

Answers (1)

durga_sury

Reputation: 1152

Out of the box, SageMaker Model Monitor currently works only with tabular datasets (see documentation), hence the "not valid" error message. To use it on NLP problems, you have to bring your own model monitor container (BYOC). Here is an example to get started: https://aws.amazon.com/blogs/machine-learning/detect-nlp-data-drift-using-custom-amazon-sagemaker-model-monitor/. The associated GitHub repo is here: https://github.com/aws-samples/detecting-data-drift-in-nlp-using-amazon-sagemaker-custom-model-monitor
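
Separately from the tabular-data limitation, it may be worth confirming what the SageMaker API itself reports for the endpoint, since Model Monitor works from the endpoint's data capture output. Below is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. summarize_endpoint is a hypothetical helper name (not part of any SDK), and the client is passed in as an argument so the function can be exercised without AWS access. This check does not by itself explain the ValidationException; it only rules out an endpoint that is not in service or has no data capture configured.

```python
def summarize_endpoint(sagemaker_client, endpoint_name):
    """Return the endpoint status and data-capture settings from DescribeEndpoint.

    `sagemaker_client` is expected to behave like boto3.client("sagemaker").
    """
    desc = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    # DataCaptureConfig is absent when capture was never configured.
    capture = desc.get("DataCaptureConfig", {})
    return {
        "status": desc["EndpointStatus"],
        "capture_enabled": capture.get("EnableCapture", False),
        "capture_status": capture.get("CaptureStatus"),
        "capture_s3_uri": capture.get("DestinationS3Uri"),
    }


# Usage against a real account (requires credentials):
# import boto3
# print(summarize_endpoint(boto3.client("sagemaker"), "clinc-intent-analysis-0911"))
```

If the status is not InService or capture is not enabled, that is worth fixing first; otherwise the custom-container route above is the one to follow.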

Upvotes: 1
