Reputation: 365
AWS SageMaker model deployment fails when the endpoint_name argument is specified. Any thoughts?
Without the endpoint_name argument in deploy, the model deploys successfully. Training the model and saving it to the S3 location succeed either way.
import boto3
import numpy as np  # needed for np.split below
import os
import sagemaker
from sagemaker import get_execution_role
from sagemaker.predictor import csv_serializer
from sagemaker.amazon.amazon_estimator import get_image_uri
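bucket = 'Y'
prefix = 'Z'
suffix = 'X'  # assumption: not defined in the original post; matches the 'X' path segment used in the uploads below
role = get_execution_role()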
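# df is assumed to be a pandas DataFrame loaded earlier (not shown in the post)
# Shuffle, then split 50% / 30% / 20% into train / validation / test
train_data, validation_data, test_data = np.split(
    df.sample(frac=1, random_state=100),
    [int(0.5 * len(df)), int(0.8 * len(df))])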
train_data.to_csv('train.csv', index=False, header=False)
validation_data.to_csv('validation.csv', index=False, header=False)
test_data.to_csv('test.csv', index=False)
boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'train/X/train.csv')).upload_file('train.csv')
boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'validation/X/validation.csv')).upload_file('validation.csv')
container = get_image_uri(boto3.Session().region_name, 'xgboost')
#print(container)
s3_input_train = sagemaker.s3_input(s3_data='s3://{}/{}/train/{}'.format(bucket, prefix, suffix), content_type='csv')
s3_input_validation = sagemaker.s3_input(s3_data='s3://{}/{}/validation/{}/'.format(bucket, prefix, suffix), content_type='csv')
sess = sagemaker.Session()
output_loc = 's3://{}/{}/output'.format(bucket, prefix)
xgb = sagemaker.estimator.Estimator(container,
                                    role,
                                    train_instance_count=1,
                                    train_instance_type='ml.m4.xlarge',
                                    output_path=output_loc,
                                    sagemaker_session=sess,
                                    base_job_name='X')
#print('Model output to: {}'.format(output_loc))
xgb.set_hyperparameters(eta=0.5,
                        objective='reg:linear',
                        eval_metric='rmse',
                        max_depth=3,
                        min_child_weight=1,
                        gamma=0,
                        early_stopping_rounds=10,
                        subsample=0.8,
                        colsample_bytree=0.8,
                        num_round=1000)
#Model fitting
xgb.fit({'train': s3_input_train, 'validation': s3_input_validation})
#Deploy model to an endpoint with a custom name
xgb_predictor_X = xgb.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge', endpoint_name='X')
xgb_predictor_X.content_type = 'text/csv'
xgb_predictor_X.serializer = csv_serializer
xgb_predictor_X.deserializer = None
INFO:sagemaker:Creating endpoint with name delaymins
ClientError: An error occurred (ValidationException) when calling the CreateEndpoint operation: Could not find model "arn:aws:sagemaker:us-west-2::model/X-2019-01-08-18-17-42-158".
Upvotes: 1
Views: 1910
Reputation: 365
Figured it out! If an endpoint with a custom name is not deleted before you redeploy to that same name, the name seems to get blacklisted (I'm not sure whether this is temporary). If you make this mistake, you have to use a different endpoint name. Moral of the story: always delete an endpoint before redeploying.
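For anyone who wants to automate the cleanup, here is a minimal sketch of deleting the old endpoint before redeploying. It is not from the original post: the helper name delete_endpoint_if_present is hypothetical, and it assumes the endpoint configuration has the same name as the endpoint and that a missing resource is reported as a ValidationException (as in the error above). One possible explanation for the error, not confirmed in this thread, is a leftover endpoint configuration that still references a model that no longer exists, which is why the sketch removes the configuration as well.

```python
def delete_endpoint_if_present(sm_client, endpoint_name):
    """Delete an endpoint and its same-named endpoint config, if they exist.

    sm_client is expected to be a boto3 SageMaker client.
    Returns the kinds of resources that were actually deleted.
    """
    deleted = []
    calls = [
        ("endpoint", sm_client.delete_endpoint,
         {"EndpointName": endpoint_name}),
        # Assumption: the endpoint config shares the endpoint's name.
        ("endpoint_config", sm_client.delete_endpoint_config,
         {"EndpointConfigName": endpoint_name}),
    ]
    for kind, call, kwargs in calls:
        try:
            call(**kwargs)
            deleted.append(kind)
        except Exception as err:
            # botocore's ClientError keeps the service error in `.response`.
            code = getattr(err, "response", {}).get("Error", {}).get("Code")
            # Assumption: "not found" surfaces as a ValidationException;
            # anything else is re-raised unchanged.
            if code != "ValidationException":
                raise
    return deleted


# Usage (requires AWS credentials, so left commented out):
# import boto3
# sm_client = boto3.client("sagemaker")
# delete_endpoint_if_present(sm_client, "X")
# xgb_predictor_X = xgb.deploy(initial_instance_count=1,
#                              instance_type="ml.m4.xlarge",
#                              endpoint_name="X")
```

Endpoint deletion is not instantaneous, so it may be necessary to wait a little before calling deploy again with the same name.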
Upvotes: 2