Frank63

Reputation: 79

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Could not access model data

I want to deploy an MLflow image that contains a machine learning model to an AWS SageMaker endpoint. I ran the following code, which I found in this blog post.

import mlflow.sagemaker as mfs

run_id = run_id # the model you want to deploy - this run_id was saved when we trained our model
region = "us-east-1" # region of your account
aws_id = "XXXXXXXXXXX" # from the aws-cli output
arn = "arn:aws:iam::XXXXXXXXXXX:role/your-role"
app_name = "iris-rf-1"
# experiment_id and run_id both come from the training run; edit this path based on your working directory
model_uri = "mlruns/%s/%s/artifacts/random-forest-model" % (experiment_id, run_id)
image_url = aws_id + ".dkr.ecr." + region + ".amazonaws.com/mlflow-pyfunc:1.2.0" # change to your mlflow version

mfs.deploy(app_name=app_name, 
           model_uri=model_uri, 
           region_name=region, 
           mode="create",
           execution_role_arn=arn,
           image_url=image_url)

But I got the following error. I checked all the policies and permissions attached to the IAM role, and as far as I can tell they satisfy everything the error message asks for.

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Could not access model data at https://s3.amazonaws.com/mlflow-sagemaker-us-east-1-xxx/mlflow-xgb-demo-model-eqktjeoit5mxhmjn-abpanw/model.tar.gz. Please ensure that the role "arn:aws:iam::xxx:role/mlflow-sagemaker-dev" exists and that its trust relationship policy allows the action "sts:AssumeRole" for the service principal "sagemaker.amazonaws.com". Also ensure that the role has "s3:GetObject" permissions and that the object is located in us-east-1.

How can I resolve this?

Upvotes: 0

Views: 4142

Answers (2)

Anjani Dubey

Reputation: 1

Check the name of your stack in the CloudFormation service. This error occurs when the stack name differs from the project name; by default, the command in the README.md file uses the project name as the stack name. Once the two names match, the error should be resolved.

Upvotes: 0

Frank63

Reputation: 79

I found the root cause. I had to go to the "Trust relationship" section of the IAM role and add "sagemaker.amazonaws.com" as a service principal.
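For anyone who prefers to script the trust-relationship change instead of using the console, here is a minimal sketch. It assumes boto3 is installed and that your credentials are allowed to read and update the role. The helper names allow_service_to_assume and update_role_trust are hypothetical, made up for this illustration. The first function only edits a policy dictionary, so it can be tried without touching AWS; the second applies the result with the IAM get_role and update_assume_role_policy calls.

```python
import copy
import json

SAGEMAKER_PRINCIPAL = "sagemaker.amazonaws.com"


def _as_list(value):
    """IAM policy fields may hold a single item or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def allow_service_to_assume(trust_policy, service=SAGEMAKER_PRINCIPAL):
    """Return a copy of trust_policy in which `service` may call sts:AssumeRole.

    Hypothetical helper: the input dictionary is left unmodified.
    """
    policy = copy.deepcopy(trust_policy)
    policy.setdefault("Version", "2012-10-17")
    statements = _as_list(policy.get("Statement"))

    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(stmt.get("Action")):
            continue
        principal = stmt.get("Principal")
        if isinstance(principal, dict) and service in _as_list(principal.get("Service")):
            # The service is already trusted; nothing to add.
            policy["Statement"] = statements
            return policy

    # No existing statement trusts the service, so append a dedicated one.
    statements.append({
        "Effect": "Allow",
        "Principal": {"Service": service},
        "Action": "sts:AssumeRole",
    })
    policy["Statement"] = statements
    return policy


def update_role_trust(role_name):
    """Fetch the role's trust policy, add SageMaker, and write it back."""
    import boto3  # imported here so the helper above works without AWS access

    iam = boto3.client("iam")
    current = iam.get_role(RoleName=role_name)["Role"]["AssumeRolePolicyDocument"]
    updated = allow_service_to_assume(current)
    iam.update_assume_role_policy(
        RoleName=role_name,
        PolicyDocument=json.dumps(updated),
    )
```

With the role from the error message above, the call would be update_role_trust("mlflow-sagemaker-dev"). The sketch does not inspect Condition blocks, so review the resulting policy before relying on it.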

Upvotes: 2
