Sai

Reputation: 1

Problem with deploying a finetuned Gemma model in AWS SageMaker as an endpoint


finetuned_gemma/model-00004-of-00004.safetensors
finetuned_gemma/tokenizer_config.json
finetuned_gemma/model.safetensors.index.json
finetuned_gemma/config.json
finetuned_gemma/model-00002-of-00004.safetensors
finetuned_gemma/generation_config.json
finetuned_gemma/special_tokens_map.json
finetuned_gemma/model-00001-of-00004.safetensors
finetuned_gemma/tokenizer.json
finetuned_gemma/code/
finetuned_gemma/code/requirements.txt
finetuned_gemma/code/.ipynb_checkpoints/
finetuned_gemma/code/.ipynb_checkpoints/requirements-checkpoint.txt
finetuned_gemma/code/inference.py
finetuned_gemma/model-00003-of-00004.safetensors


The finetuned model is also stored in AWS S3.

How do I now deploy the model as a SageMaker endpoint?

By the way, I have used transformers version 4.38.0, as it is the minimum requirement for the Gemma tokenizer.

I want to know how to deploy it, and which image URI to use. Please help.

I tried using `sagemaker.huggingface.HuggingFaceModel` and then deploying it, but I'm facing lots of difficulties.

Upvotes: 0

Views: 207

Answers (1)

Marc Karp

Reputation: 1314

You could use the SageMaker Large Model Inference (LMI) container, which supports Gemma models.

https://docs.djl.ai/docs/serving/serving/docs/lmi/deployment_guide/deploying-your-endpoint.html#configuration---servingproperties
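With LMI, the model is configured through a `serving.properties` file placed alongside the model artifacts. A minimal sketch, assuming a hypothetical S3 prefix for your uncompressed model files (option names are from the LMI deployment guide linked above; values here are illustrative):

```
engine=Python
option.model_id=s3://my-bucket/finetuned_gemma/
option.dtype=fp16
option.tensor_parallel_degree=1
```

You would then create a SageMaker model pointing at the LMI image URI for your region and deploy it as an endpoint, rather than packaging an `inference.py` yourself.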

Upvotes: 0
