Reputation: 7349
I'm using the SageMaker TensorFlow estimator for training, and specifying an output path for my model artifacts with the output_path argument, with a value of s3://<bucket>/<prefix>/.
After model training, a directory named <training_job_name>/output is created in the specified output_path.
The issue I'm having is that the source code used for training is also uploaded to S3 by default, but instead of being placed in s3://<bucket>/<prefix>/<training_job_name>/source, it's placed in s3://<bucket>/<training_job_name>/source.
So how can I specify the S3 upload path for the training job's source code so that it uses both the bucket AND the prefix of output_path?
Upvotes: 0
Views: 1223
Reputation: 384
I believe the code_location parameter shown by @user3458797 is the correct answer.
The output_path only configures the S3 location for saving the training results (model artifacts and output files).
https://sagemaker.readthedocs.io/en/stable/estimators.html
Your training script won't be saved under output_path unless you move the file into /opt/ml/model during training, or you use the code_location parameter.
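To make the difference concrete, here is a minimal sketch of how the two upload locations are composed, based on the behavior described above. The helpers model_artifact_uri and source_code_uri are illustrative only, not part of the SageMaker SDK:

```python
# Illustrative sketch (not SageMaker SDK functions): how artifact and
# source-code S3 URIs are composed from output_path and code_location.

def model_artifact_uri(output_path, job_name):
    # Model artifacts always land under output_path.
    return f"{output_path.rstrip('/')}/{job_name}/output"

def source_code_uri(bucket, job_name, code_location=None):
    # Without code_location, source is uploaded to the bucket root,
    # ignoring any prefix present in output_path.
    if code_location is None:
        return f"s3://{bucket}/{job_name}/source"
    return f"{code_location.rstrip('/')}/{job_name}/source"

# With output_path = "s3://my-bucket/my-prefix" and no code_location,
# the source ends up outside the prefix:
print(model_artifact_uri("s3://my-bucket/my-prefix", "job-1"))
print(source_code_uri("my-bucket", "job-1"))

# Setting code_location to the same bucket/prefix keeps both together:
print(source_code_uri("my-bucket", "job-1", "s3://my-bucket/my-prefix"))
```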
Please let me know if there is anything I can clarify.
Upvotes: 1
Reputation: 81
Have you tried using the “code_location” argument: https://sagemaker.readthedocs.io/en/stable/estimators.html to specify the location for the source code?
Below is a code snippet that uses code_location:
from sagemaker.tensorflow import TensorFlow

code_path = "s3://<bucket>/<prefix>"
output_path = "s3://<bucket>/<prefix>"

abalone_estimator = TensorFlow(entry_point='abalone.py',
                               role=role,
                               framework_version='1.12.0',
                               training_steps=100,
                               image_name=image,
                               evaluation_steps=100,
                               hyperparameters={'learning_rate': 0.001},
                               train_instance_count=1,
                               train_instance_type='ml.c4.xlarge',
                               code_location=code_path,
                               output_path=output_path,
                               base_job_name='my-job-name')
Upvotes: 4