Reputation: 7630
I am trying to set up a bring-your-own-model training job on SageMaker. I have R code, but when I run the job it fails.
Training Image:
FROM r-base:3.6.3
MAINTAINER Amazon SageMaker Examples <[email protected]>
RUN apt-get -y update && apt-get install -y --no-install-recommends \
    wget \
    r-base \
    r-base-dev \
    apt-transport-https \
    ca-certificates \
    python3 python3-dev pip
ENV AWS_DEFAULT_REGION="us-east-2"
RUN R -e "install.packages('reticulate', dependencies = TRUE, warning = function(w) stop(w))"
RUN R -e "install.packages('readr', dependencies = TRUE, warning = function(w) stop(w))"
RUN R -e "install.packages('dplyr', dependencies = TRUE, warning = function(w) stop(w))"
RUN pip install --quiet --no-cache-dir \
    'boto3>1.0<2.0' \
    'sagemaker>2.0<3.0'
ENTRYPOINT ["/usr/bin/Rscript"]
Source code:
rcode
├── train.R
└── train.tar.gz
Build
- aws s3 cp $CODEBUILD_SRC_DIR/rcode/ s3://${self:custom.deploymentBucket}/${self:service}/code/training --recursive
Serverless.com yaml
SagemakerRCodeTrainingStep:
  Type: Task
  Resource: ${self:custom.sageMakerTrainingJob}
  Parameters:
    TrainingJobName.$: "$.sageMakerTrainingJobName"
    DebugHookConfig:
      S3OutputPath: "s3://${self:custom.deploymentBucket}/${self:service}/models/rmodel"
    AlgorithmSpecification:
      TrainingImage: ${self:custom.sagemakerRExecutionContainerURI}
      TrainingInputMode: "File"
    OutputDataConfig:
      S3OutputPath: "s3://${self:custom.deploymentBucket}/${self:service}/models/rmodel"
    StoppingCondition:
      MaxRuntimeInSeconds: ${self:custom.maxRuntime}
    ResourceConfig:
      InstanceCount: 1
      InstanceType: "ml.m5.xlarge"
      VolumeSizeInGB: 30
    RoleArn: ${self:custom.stateMachineRoleARN}
    InputDataConfig:
      - DataSource:
          S3DataSource:
            S3DataType: "S3Prefix"
            S3Uri: "s3://${self:custom.datasetsFilePath}/data/processed/train"
            S3DataDistributionType: "FullyReplicated"
        ChannelName: "train"
    HyperParameters:
      sagemaker_submit_directory: "s3://${self:custom.deploymentBucket}/${self:service}/code/training/train.tar.gz"
      sagemaker_program: "train.R"
      sagemaker_enable_cloudwatch_metrics: "false"
      sagemaker_container_log_level: "20"
      sagemaker_job_name: "sagemaker-r-learn-2022-02-28-09-56-33-234"
      sagemaker_region: ${self:provider.region}
Upvotes: 1
Views: 2108
Reputation: 1
If you want to be able to execute an arbitrary R script inside your container, you will need to write an entrypoint R script that uses the arguments SageMaker passes in. The Amazon SageMaker examples repo covers this here. When you use the entrypoint parameter of the SageMaker SDK estimator class, the name of your script is passed as an argument in the run command (e.g. docker run image train script).
Note that the entrypoint argument to the estimator class does not override the image's entrypoint, as you might expect from the name. It only adds an argument to the docker run command.
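The dispatch logic described above can be sketched roughly as follows. The answer talks about an R entrypoint; Python is used here only because the image in the question already installs python3, and the same steps carry over to R. The sketch assumes the standard SageMaker location for hyperparameters (/opt/ml/input/config/hyperparameters.json). The function names, the /opt/ml/code script directory, and the decision to pick the script from the sagemaker_program hyperparameter are all hypothetical choices, not something the question or the SDK prescribes.

```python
import json
import subprocess
import sys
from pathlib import Path

# Standard SageMaker training path; hyperparameter values arrive as strings.
HYPERPARAMETERS_PATH = Path("/opt/ml/input/config/hyperparameters.json")


def load_hyperparameters(path=HYPERPARAMETERS_PATH):
    """Return the job's hyperparameters, or an empty dict if the file is absent."""
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def build_command(argv, hyperparameters, script_dir="/opt/ml/code"):
    """Map the argument SageMaker passes ("train") to an Rscript invocation.

    script_dir is a hypothetical location: use wherever the image stores train.R.
    """
    if len(argv) < 2 or argv[1] != "train":
        raise SystemExit(f"unsupported arguments: {argv[1:]}")
    # sagemaker_program is the key used in the question's HyperParameters block.
    program = hyperparameters.get("sagemaker_program", "train.R")
    return ["/usr/bin/Rscript", f"{script_dir}/{program}"]


def main(argv):
    """Run the selected R script and return its exit code to SageMaker."""
    command = build_command(argv, load_hyperparameters())
    return subprocess.call(command)
```

If a file like this were used as the image's entrypoint, it would end with `sys.exit(main(sys.argv))`, so that a non-zero exit code from the R script marks the training job as failed.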
Upvotes: 0
Reputation: 1314
I am not sure which TrainingImage you are using or which files are in your container.
That being said, I suspect you are using a custom container.
SageMaker Training Jobs look for a train file and run your container as follows:
docker run image train
You can change this behavior by setting the ENTRYPOINT in your Dockerfile. Please see this example Dockerfile from the r_byo_r_algo_hpo example.
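To see why the ENTRYPOINT matters, here is a small sketch of how Docker combines an exec-form ENTRYPOINT with the arguments given to docker run. The helper name effective_command and the /opt/ml/code/train.R path are hypothetical and used only for illustration; the first ENTRYPOINT is the one from the Dockerfile in the question.

```python
def effective_command(entrypoint, command):
    """Exec-form ENTRYPOINT is the executable; `docker run` arguments are appended."""
    return list(entrypoint) + list(command)


# SageMaker starts a training container with the single argument "train".
SAGEMAKER_ARGS = ["train"]

# Dockerfile from the question: ENTRYPOINT ["/usr/bin/Rscript"]
question_image = effective_command(["/usr/bin/Rscript"], SAGEMAKER_ARGS)

# Hypothetical alternative: ENTRYPOINT ["/usr/bin/Rscript", "/opt/ml/code/train.R"]
fixed_image = effective_command(
    ["/usr/bin/Rscript", "/opt/ml/code/train.R"], SAGEMAKER_ARGS
)

print(" ".join(question_image))  # /usr/bin/Rscript train
print(" ".join(fixed_image))     # /usr/bin/Rscript /opt/ml/code/train.R train
```

With the question's ENTRYPOINT, Rscript is asked to run a file literally named train from the working directory. The Dockerfile shown never copies such a file into the image, which would explain the failure. In the second form the R script always runs and receives "train" as an ordinary command-line argument.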
Upvotes: 2