Reputation: 21
I am not able to save model artifacts to an S3 bucket using the code below. I am able to save the result to the output data path, and the training job completes successfully. I am using the following piece of code.
Can anyone please confirm how to save the model artifacts to the model directory using this code?
# train.py code
#!/usr/bin/env python
from __future__ import print_function

import os

import pandas as pd

prefix = '/opt/ml/'
input_dir = prefix + 'input/data'
output_data_dir = os.path.join(prefix, 'output/data')
model_dir = os.path.join(prefix, 'model')
channel_name = 'training'
training_path = os.path.join(input_dir, channel_name)

# The function to execute the training.
def train():
    print('Starting the training.')
    # Take the set of files and read them all into a single pandas dataframe
    input_files = [os.path.join(training_path, file) for file in os.listdir(training_path)]
    raw_data = [pd.read_csv(file, header=None) for file in input_files]
    input_data = pd.concat(raw_data)
    print(input_data)
    # to_csv() writes in place and returns None, so no assignment is needed
    input_data.to_csv(os.path.join(output_data_dir, 'output.csv'), header=False, index=False)

if __name__ == '__main__':
    train()
# Below are the S3 input and output paths:
output_path = "s3://{}/{}".format(bucket, prefix_output)
S3_input = "s3://{}/{}".format(bucket, prefix)

# Estimator code
test_estimator = sagemaker.estimator.Estimator(
    ecr_image,                       # ECR image URI of the training container
    role=role,                       # execution role
    instance_count=1,                # no. of SageMaker instances
    instance_type='ml.m4.xlarge',    # instance type
    output_path=output_path,         # S3 path to store model outputs
    base_job_name='sagemaker-job1',  # job name prefix
    sagemaker_session=session,       # session
)

# Launch the instance and start training
test_estimator.fit({'training': S3_input})
What is missing in this code?
Upvotes: 2
Views: 7758
Reputation: 2765
SageMaker automatically saves to output_path everything that is inside your model directory, i.e. everything in /opt/ml/model. If the training job completes successfully, SageMaker takes everything in that folder, creates a model.tar.gz, and uploads it to your output_path, inside a folder with the same name as your training job (SageMaker creates this folder).
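Applied to the question's train.py, the missing piece is simply a write into model_dir before train() returns. A minimal sketch, pickling the concatenated DataFrame purely as a stand-in for a real trained model (the save_model helper is hypothetical):

import os
import pickle

def save_model(obj, model_dir='/opt/ml/model'):
    # Anything written under /opt/ml/model is packaged into model.tar.gz
    # and uploaded to output_path when the training job completes.
    with open(os.path.join(model_dir, 'model.pkl'), 'wb') as f:
        pickle.dump(obj, f)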
You can also use the environment variable SM_OUTPUT_DATA_DIR, which by default points to /opt/ml/output/data, for non-model training artifacts (e.g. evaluation results); SageMaker will create an archive of that folder named output.tar.gz and upload it to the same S3 folder as model.tar.gz.
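Assuming the bucket, prefix, and job name from the question, the final S3 layout would look roughly like this (SageMaker appends a timestamp to base_job_name):

s3://<bucket>/<prefix_output>/sagemaker-job1-<timestamp>/output/model.tar.gz
s3://<bucket>/<prefix_output>/sagemaker-job1-<timestamp>/output/output.tar.gz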
I don't understand exactly what you mean by "result", but whatever you want to end up in the model archive, it's up to you to save it in your model_dir.
For example, this is how I save my model in both JSON and H5; the former ends up in the output.tar.gz archive, the latter in model.tar.gz:
import os

output_artifacts = os.environ.get('SM_OUTPUT_DATA_DIR')  # archived as output.tar.gz
with open(os.path.join(output_artifacts, "model.json"), "w") as json_file:
    json_file.write(model_json)  # model_json: serialized architecture from the training script

model_directory = os.environ.get('SM_MODEL_DIR')  # archived as model.tar.gz
model.save(os.path.join(model_directory, 'model.h5'))
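One caveat worth verifying for a fully custom image like the one in the question: the SM_* environment variables are set by SageMaker's training toolkit, so in a container that doesn't include it they may be absent. Falling back to the fixed container paths keeps the script working either way:

import os

# Fall back to the documented container paths if the SM_* variables are unset.
model_directory = os.environ.get('SM_MODEL_DIR', '/opt/ml/model')
output_artifacts = os.environ.get('SM_OUTPUT_DATA_DIR', '/opt/ml/output/data')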
Upvotes: 5