DSexplorer

Reputation: 365

Sagemaker model deployment failing due to custom endpoint name

AWS SageMaker model deployment fails when the endpoint_name argument is specified. Any thoughts?

Without the endpoint_name argument in deploy, the model deploys successfully. Model training and saving to the S3 location succeed either way.

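    import boto3
    import os
    import numpy as np
    import sagemaker
    from sagemaker import get_execution_role
    from sagemaker.predictor import csv_serializer
    from sagemaker.amazon.amazon_estimator import get_image_uri

    bucket = 'Y'
    prefix = 'Z'
    suffix = 'X'  # assumed: matches the 'train/X/' and 'validation/X/' upload paths below

    role = get_execution_role()

    # df is a pandas DataFrame loaded earlier (not shown)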

    train_data, validation_data, test_data = np.split(df.sample(frac=1, random_state=100), [int(0.5 * len(df)), int(0.8 * len(df))])

    train_data.to_csv('train.csv', index=False, header=False)
    validation_data.to_csv('validation.csv', index=False, header=False)
    test_data.to_csv('test.csv', index=False)
    boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'train/X/train.csv')).upload_file('train.csv')
    boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'validation/X/validation.csv')).upload_file('validation.csv')

    container = get_image_uri(boto3.Session().region_name, 'xgboost')
    #print(container)

    s3_input_train = sagemaker.s3_input(s3_data='s3://{}/{}/train/{}'.format(bucket, prefix, suffix), content_type='csv')
    s3_input_validation = sagemaker.s3_input(s3_data='s3://{}/{}/validation/{}/'.format(bucket, prefix, suffix), content_type='csv')

    sess = sagemaker.Session()

    output_loc = 's3://{}/{}/output'.format(bucket, prefix)
    xgb = sagemaker.estimator.Estimator(container,
                                        role, 
                                        train_instance_count=1, 
                                        train_instance_type='ml.m4.xlarge',
                                        output_path=output_loc,
                                        sagemaker_session=sess,
                                        base_job_name='X')
    #print('Model output to: {}'.format(output_location))

    xgb.set_hyperparameters(eta=0.5,
                            objective='reg:linear',
                            eval_metric='rmse',
                            max_depth=3,
                            min_child_weight=1,
                            gamma=0,
                            early_stopping_rounds=10,
                            subsample=0.8,
                            colsample_bytree=0.8,
                            num_round=1000)

    #Model fitting
    xgb.fit({'train': s3_input_train, 'validation': s3_input_validation})

    #Deploy model to an endpoint with a custom name
    xgb_predictor_X = xgb.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge', endpoint_name='X')

    xgb_predictor_X.content_type = 'text/csv'
    xgb_predictor_X.serializer = csv_serializer
    xgb_predictor_X.deserializer = None

INFO:sagemaker:Creating endpoint with name delaymins
ClientError: An error occurred (ValidationException) when calling the CreateEndpoint operation: Could not find model "arn:aws:sagemaker:us-west-2::model/X-2019-01-08-18-17-42-158".

Upvotes: 1

Views: 1910

Answers (1)

DSexplorer

Reputation: 365

Figured it out! If an endpoint with a custom name is not deleted before you redeploy to that name, the name gets blacklisted (I'm not sure whether this is temporary). If you make this mistake, you have to use a different endpoint name. Moral of the story: always delete an endpoint before redeploying.
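A minimal sketch of "delete the endpoint before redeploying", using the low-level boto3 SageMaker client. The helper name `delete_endpoint_if_exists` is hypothetical. It also removes an endpoint configuration with the same name, on the assumption (not confirmed above) that a leftover configuration still pointing at an old, deleted model is what triggers the "Could not find model" error.

```python
def delete_endpoint_if_exists(client, endpoint_name):
    """Delete an endpoint and the same-named endpoint config, if present.

    `client` is expected to behave like boto3.client("sagemaker").
    Returns a list naming what was deleted.
    """
    deleted = []

    # Endpoint: describe first, delete only if the lookup succeeds.
    try:
        client.describe_endpoint(EndpointName=endpoint_name)
    except client.exceptions.ClientError:
        # Simplification: any client error is treated as "not found".
        pass
    else:
        client.delete_endpoint(EndpointName=endpoint_name)
        # Block until the endpoint is really gone before reusing the name.
        client.get_waiter("endpoint_deleted").wait(EndpointName=endpoint_name)
        deleted.append("endpoint")

    # Endpoint config: assumed to share the endpoint's name.
    try:
        client.describe_endpoint_config(EndpointConfigName=endpoint_name)
    except client.exceptions.ClientError:
        pass
    else:
        client.delete_endpoint_config(EndpointConfigName=endpoint_name)
        deleted.append("endpoint_config")

    return deleted


# Usage (requires AWS credentials, so left commented out):
# import boto3
# delete_endpoint_if_exists(boto3.client("sagemaker"), "X")
# xgb_predictor_X = xgb.deploy(initial_instance_count=1,
#                              instance_type="ml.m4.xlarge",
#                              endpoint_name="X")
```

Calling the helper right before `xgb.deploy(...)` makes redeploying to the same name repeatable. If the endpoint does not exist, the helper does nothing and returns an empty list.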

Upvotes: 2
