Nikhil Belure

Reputation: 72

Load model from S3 to fine-tune using SageMaker training job

I'm trying to fine-tune a model that was trained with a SageMaker training job and saved to S3. Hugging Face is looking for the model in its hub and throwing an error. Is there any way I can read the model from an S3 path?

from sagemaker.huggingface import HuggingFace
batch_size = 16

metric_definitions = [
    {'Name': 'eval_loss', 'Regex': "'eval_loss': ([0-9]+(.|e\-)[0-9]+),?"},
    {'Name': 'loss', 'Regex': "'loss': ([0-9]+(.|e\-)[0-9]+),?"},
]

# hyperparameters, which are passed into the training job
hyperparameters={'epochs': 20,
                 'train_batch_size': batch_size,
                 'model_name': "model",
                 }

# create the Estimator
huggingface_estimator = HuggingFace(
    entry_point='train.py',
    source_dir='./code',
    instance_type='ml.g5.xlarge',
    instance_count=1,
    role=role,
    transformers_version='4.26',
    pytorch_version='1.13',
    py_version='py39',
    metric_definitions=metric_definitions,
    hyperparameters=hyperparameters
)

# starting the train job with our uploaded datasets as input
huggingface_estimator.fit({'train': training_input_path, 'test': val_input_path,
                          "input_model": "s3://sagemaker-us-east-1-18**********/huggingface-pytorch-training-1/output/model.tar.gz"})

I tried loading the model in a SageMaker notebook and using it for inference and training, and that works. Since a SageMaker training job runs in a separate container, I'm trying to download the files from S3 inside the container using the train.py script.
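For context, since fit() above already passes the model archive as the "input_model" channel, SageMaker downloads it into the container automatically, under /opt/ml/input/data/input_model, and exposes that path via the SM_CHANNEL_INPUT_MODEL environment variable. A minimal sketch of how train.py could pick the archive up from the channel (the /tmp extraction directory is an arbitrary choice):

import os
import tarfile

# SageMaker downloads each fit() channel into /opt/ml/input/data/<channel>
# and exposes the path via an SM_CHANNEL_* environment variable
input_dir = os.environ.get("SM_CHANNEL_INPUT_MODEL", "/opt/ml/input/data/input_model")

# the channel holds the model.tar.gz produced by the earlier training job
with tarfile.open(os.path.join(input_dir, "model.tar.gz")) as tar:
    tar.extractall("/tmp/old_model")  # arbitrary scratch directory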

Upvotes: 0

Views: 630

Answers (1)

Nikhil Belure

Reputation: 72

Solution: A SageMaker training job creates a container to run the training. You can download your model archive from S3 inside that container, extract it, and load it by specifying the local path.

Add the following code to your training script to download the model files inside the container.

import boto3
import tarfile

# download the model archive from S3 into the container
# (the first argument is the bucket name only, without the 's3://' prefix)
s3 = boto3.client('s3')
s3.download_file('s3_bucket', 'path_to_model/output/model.tar.gz', 'model.tar.gz')

# extract the archive so from_pretrained can read the model files
with tarfile.open('model.tar.gz') as tar:
    tar.extractall('/opt/ml/model/old_model')
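If only the full S3 URI is at hand, it can be split into the bucket and key that download_file expects. A small sketch, with a placeholder URI rather than the actual bucket from the question:

import boto3

# split an s3:// URI into bucket name and object key (placeholder URI)
s3_uri = "s3://my-bucket/path_to_model/output/model.tar.gz"
bucket, _, key = s3_uri.removeprefix("s3://").partition("/")

s3 = boto3.client("s3")
s3.download_file(bucket, key, "model.tar.gz")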

Once you have the model files downloaded locally into your container, you can load the tokenizer and model as follows:

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# point from_pretrained at the extracted files instead of the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("/opt/ml/model/old_model", local_files_only=True)
model = AutoModelForQuestionAnswering.from_pretrained("/opt/ml/model/old_model", local_files_only=True)
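After fine-tuning, whatever is written to /opt/ml/model is packaged into model.tar.gz and uploaded to S3 when the job finishes, so the retrained weights should be saved there. A minimal sketch, assuming the usual save_pretrained flow (not shown in the original script):

# SageMaker uploads the contents of /opt/ml/model to S3 at job end
model.save_pretrained("/opt/ml/model")
tokenizer.save_pretrained("/opt/ml/model")

Note that because the old model was extracted under /opt/ml/model, those files will end up in the output archive too; extracting to a scratch directory such as /tmp would avoid that.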

Upvotes: 0
