Ben

Reputation: 993

Getting sagemaker container locally

When I try to run SageMaker locally for TensorFlow in script mode, it seems I cannot pull the Docker container. I have run the code below from a SageMaker notebook instance and everything worked fine, but when I run it on my own machine it fails.

How can I download the container so that I can debug things locally?

import os

import sagemaker
from sagemaker.tensorflow import TensorFlow


hyperparameters = {}
role = 'arn:aws:iam::xxxxxxxx:role/yyyyyyy'
estimator = TensorFlow(
    entry_point='train.py',
    source_dir='.',
    train_instance_type='local',
    train_instance_count=1,
    hyperparameters=hyperparameters,
    role=role,
    py_version='py3',
    framework_version='1.12.0',
    script_mode=True)

estimator.fit()

I get this output:

INFO:sagemaker:Creating training-job with name: sagemaker-tensorflow-scriptmode-2019-01-28-18-51-57-787
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Error response from daemon: pull access denied for 520713654638.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-tensorflow-scriptmode, repository does not exist or may require 'docker login'

subprocess.CalledProcessError: Command 'docker pull 520713654638.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-tensorflow-scriptmode:1.12.0-cpu-py3' returned non-zero exit status 1.

The warning looks like the output you get when running the docker login steps described here. If I follow those steps to log in to the registry that hosts the TensorFlow container, it reports that the login succeeded:

Invoke-Expression -Command (aws ecr get-login --no-include-email --registry-ids 520713654638 --region eu-west-2)
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Login Succeeded

But I still cannot pull the image:

docker pull 520713654638.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-tensorflow-scriptmode:1.11.0-cpu-py3
Error response from daemon: pull access denied for 520713654638.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-tensorflow-scriptmode, repository does not exist or may require 'docker login'

Upvotes: 3

Views: 2948

Answers (1)

Julien Simon

Reputation: 2719

The same sequence works for me locally: 'aws ecr get-login', 'docker login', 'docker pull'.
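To make that sequence concrete, here is a minimal sketch that derives the three commands from the image URI in the error message. The helper names parse_ecr_image and login_and_pull_commands are made up for illustration, and the regular expression is only an assumption about how these URIs are laid out. The sketch only builds the command strings; it does not run them.

```python
import re

# Assumed layout: <12-digit account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
ECR_IMAGE_PATTERN = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_image(image_uri):
    """Split an ECR image URI into account, region, repository and tag."""
    match = ECR_IMAGE_PATTERN.match(image_uri)
    if match is None:
        raise ValueError("not an ECR image URI: " + image_uri)
    return match.groupdict()


def login_and_pull_commands(image_uri):
    """Return the get-login, docker login and docker pull commands, in order."""
    parts = parse_ecr_image(image_uri)
    registry = image_uri.split("/", 1)[0]
    return [
        # Step 1: same command as in the question (AWS CLI v1 syntax).
        "aws ecr get-login --no-include-email --registry-ids {account} "
        "--region {region}".format(**parts),
        # Step 2: get-login prints a command of this shape; the password is
        # left as a placeholder here.
        "docker login -u AWS -p <password> https://" + registry,
        # Step 3: pull the exact tag that the SageMaker SDK asked for.
        "docker pull " + image_uri,
    ]
```

Passing the URI from the CalledProcessError message (the one ending in :1.12.0-cpu-py3) gives the commands for exactly the image the SDK tried to pull, with the account and region taken from the URI itself.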

Does your local IAM user have sufficient permissions to pull from ECR? The 'AmazonEC2ContainerRegistryReadOnly' policy should be enough: https://docs.aws.amazon.com/AmazonECR/latest/userguide/ecr_managed_policies.html
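One way to test the permission question without going through Docker is to ask ECR for the image metadata directly, using the same local credentials. The sketch below is only an illustration: check_image_access is a hypothetical helper, and it assumes you pass in a boto3 ECR client created on your machine, for example boto3.client("ecr", region_name="eu-west-2"). It relies on the client's batch_get_image call. If this check passes, the pull may still fail for other reasons.

```python
def check_image_access(ecr_client, registry_id, repository, tag):
    """Return (ok, detail) for whether the caller can read one image's metadata.

    ecr_client is assumed to be a boto3 ECR client; any object with a
    compatible batch_get_image method works, which keeps this easy to test.
    """
    try:
        response = ecr_client.batch_get_image(
            registryId=registry_id,
            repositoryName=repository,
            imageIds=[{"imageTag": tag}],
        )
    except Exception as err:  # e.g. an access-denied error raised by the client
        return False, str(err)

    # ECR reports per-image problems (such as a missing tag) in "failures".
    failures = response.get("failures") or []
    if failures:
        return False, str(failures)

    if not response.get("images"):
        return False, "no image returned"
    return True, "image metadata readable"
```

Calling it with registry ID '520713654638', repository 'sagemaker-tensorflow-scriptmode' and tag '1.12.0-cpu-py3' mirrors the pull that fails in the question. A False result with an access-denied message would point at IAM rather than at Docker.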

Alternatively, you can grab the container source from GitHub and build it yourself: https://github.com/aws/sagemaker-tensorflow-container

Upvotes: 2
