Reputation: 993
When I try to run SageMaker locally for TensorFlow in script mode, I cannot pull the Docker container. I ran the code below from a SageMaker notebook instance and everything worked, but when I run it on my own machine it fails.
How can I download the container, so I can debug things locally?
import os
import sagemaker
from sagemaker.tensorflow import TensorFlow
hyperparameters = {}
role = 'arn:aws:iam::xxxxxxxx:role/yyyyyyy'
estimator = TensorFlow(
    entry_point='train.py',
    source_dir='.',
    train_instance_type='local',
    train_instance_count=1,
    hyperparameters=hyperparameters,
    role=role,
    py_version='py3',
    framework_version='1.12.0',
    script_mode=True)
estimator.fit()
I get this output:
INFO:sagemaker:Creating training-job with name: sagemaker-tensorflow-scriptmode-2019-01-28-18-51-57-787
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Error response from daemon: pull access denied for 520713654638.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-tensorflow-scriptmode, repository does not exist or may require 'docker login'
subprocess.CalledProcessError: Command 'docker pull 520713654638.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-tensorflow-scriptmode:1.12.0-cpu-py3' returned non-zero exit status 1.
The warning looks like the output of the docker login command described here. If I follow those steps to log in to the registry that hosts the TensorFlow container, it reports that the login succeeded:
Invoke-Expression -Command (aws ecr get-login --no-include-email --registry-ids 520713654638 --region eu-west-2)
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Login Succeeded
But I still cannot pull the image:
docker pull 520713654638.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-tensorflow-scriptmode:1.11.0-cpu-py3
Error response from daemon: pull access denied for 520713654638.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-tensorflow-scriptmode, repository does not exist or may require 'docker login'
Upvotes: 3
Views: 2948
Reputation: 2719
The same sequence works for me locally: 'aws ecr get-login', 'docker login', 'docker pull'.
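For reference, here is a rough Python sketch of that sequence. It is only an illustration: the helper names `parse_ecr_image` and `login_and_pull_commands` are made up for this example, the image URI is the one from the error message in the question, and the `aws ecr get-login` flags are copied from the question (newer AWS CLI versions replace that command with `aws ecr get-login-password`). The sketch only builds the command strings; it does not run them.

```python
import re
from typing import Dict, List

# Matches URIs of the form <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
_ECR_IMAGE = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_image(uri: str) -> Dict[str, str]:
    """Split an ECR image URI into account, region, repository and tag."""
    match = _ECR_IMAGE.match(uri)
    if match is None:
        raise ValueError(f"not an ECR image URI: {uri}")
    return match.groupdict()


def login_and_pull_commands(uri: str) -> List[str]:
    """Build the shell commands for the login-then-pull sequence (hypothetical helper)."""
    parts = parse_ecr_image(uri)
    return [
        # Prints a 'docker login ...' command that still has to be executed.
        "aws ecr get-login --no-include-email "
        f"--registry-ids {parts['account']} --region {parts['region']}",
        # Pull the exact tag that SageMaker local mode asked for.
        f"docker pull {uri}",
    ]


if __name__ == "__main__":
    image = (
        "520713654638.dkr.ecr.eu-west-2.amazonaws.com/"
        "sagemaker-tensorflow-scriptmode:1.12.0-cpu-py3"
    )
    for command in login_and_pull_commands(image):
        print(command)
```

Deriving the registry ID and region from the image URI keeps the login and the pull pointed at the same registry, which is easy to get wrong when typing the commands by hand.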
Does your local IAM user have sufficient credentials to pull from ECR? The 'AmazonEC2ContainerRegistryReadOnly' policy should be enough: https://docs.aws.amazon.com/AmazonECR/latest/userguide/ecr_managed_policies.html
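One way to see which identity your local credentials resolve to, and whether that identity can at least request an ECR token, is a short boto3 script. This is a sketch under assumptions, not a definitive check: `decode_ecr_token` and `check_ecr_access` are names invented here, boto3 must be installed and configured, and a successful token request does not by itself prove that you may pull from a repository owned by another account.

```python
import base64
from typing import Tuple


def decode_ecr_token(token: str) -> Tuple[str, str]:
    """Split a base64-encoded ECR authorization token into (username, password)."""
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


def check_ecr_access(registry_id: str, region: str) -> str:
    """Print the caller identity and request an ECR token for the given registry.

    Returns the registry endpoint from the token response. A permissions
    problem surfaces as a botocore ClientError from one of the two calls.
    """
    import boto3  # third-party; imported lazily so decode_ecr_token works without it

    identity = boto3.client("sts", region_name=region).get_caller_identity()
    print("Calling ECR as:", identity["Arn"])

    ecr = boto3.client("ecr", region_name=region)
    response = ecr.get_authorization_token(registryIds=[registry_id])
    auth = response["authorizationData"][0]
    username, _password = decode_ecr_token(auth["authorizationToken"])
    print("Token issued for user", username, "at", auth["proxyEndpoint"])
    return auth["proxyEndpoint"]


if __name__ == "__main__":
    # Registry ID and region taken from the image URI in the question.
    check_ecr_access("520713654638", "eu-west-2")
```

If the printed ARN is not the user or role you expected, the local AWS profile is the first thing to check before looking at policies.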
Alternatively, you can grab the container source from GitHub and build it yourself: https://github.com/aws/sagemaker-tensorflow-container
Upvotes: 2