Reputation: 628
I'm deploying a serverless NLP app built with BERT. I'm currently using the Serverless Framework and AWS ECR to overcome the AWS Lambda deployment package limit of 250 MB (PyTorch alone already occupies more than that).
I'm quite happy with this solution as it allows me to simply dockerize my app, upload it to ECR and worry about nothing else.
One doubt I have is where I should store the models. My app uses 3 different saved models, each with a size of 422 MB. I have two options:
Copy my models into the Docker image itself.
Store my models in S3 and load them at runtime (see the sketch after this list).
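For reference, here is a rough sketch of what option 2 could look like in the Lambda handler, assuming the models were saved with torch.save and using placeholder bucket/key names:

```python
# Rough sketch (not battle-tested): load a model straight from S3 into memory
# on cold start and cache it for warm invocations.
# Assumptions: models were saved with torch.save(model, ...); bucket/keys are placeholders.
import io
import boto3
import torch

S3_BUCKET = "my-model-bucket"            # placeholder
MODEL_KEYS = {
    "model-a": "models/model_a.pt",      # placeholder keys, 3 models of ~422 MB each
    "model-b": "models/model_b.pt",
    "model-c": "models/model_c.pt",
}

_s3 = boto3.client("s3")
_cache = {}                              # survives warm invocations

def _get_model(name):
    if name not in _cache:
        body = _s3.get_object(Bucket=S3_BUCKET, Key=MODEL_KEYS[name])["Body"].read()
        model = torch.load(io.BytesIO(body), map_location="cpu")
        model.eval()
        _cache[name] = model
    return _cache[name]

def handler(event, context):
    model = _get_model(event.get("model", "model-a"))
    # ... run BERT inference here ...
    return {"statusCode": 200}
```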
So my question ultimately is: of the two solutions, which is the best practice? Why or why not? Is there even a best practice at all, or is it based on preference / need?
Upvotes: 1
Views: 384
Reputation: 21510
There is a third option that might be great for you: store your models on an EFS volume.
EFS volumes are like additional hard drives that you can attach to your Lambda. They can be pretty much as big as you want.
After you have trained your model, just copy it to your EFS volume. You configure your Lambda to mount that EFS volume when it boots and voilà, your model is available without any fuss. No copying from S3 or putting it in a Docker image. And the same EFS volume can be mounted to more than one Lambda at the same time.
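To give an idea of what this looks like from the Lambda's point of view, here is a minimal sketch. The mount path /mnt/models and the file names are placeholders, the models are assumed to have been saved with torch.save, and the EFS access point / file system config wiring in your Serverless Framework setup is not shown:

```python
# Rough sketch (not battle-tested): load models lazily from an EFS volume
# mounted into the Lambda. /mnt/models and the file names are placeholders.
import os
import torch

MODEL_DIR = "/mnt/models"   # whatever mount path you configure for the Lambda
_cache = {}                 # survives warm invocations

def _get_model(name):
    if name not in _cache:
        model = torch.load(os.path.join(MODEL_DIR, f"{name}.pt"), map_location="cpu")
        model.eval()
        _cache[name] = model
    return _cache[name]

def handler(event, context):
    model = _get_model(event.get("model", "model-a"))
    # ... run inference; no copy from S3 and no model baked into the image ...
    return {"statusCode": 200}
```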
To learn more read:
Update 25.08.2021
User @wtfzamba tried this solution and came across a limitation that might be of interest to others:
I did indeed try the solution you suggested. It works well, but only to a point, and I'm referring to performance. In my situation, I need to be able to spin up ~100 lambdas concurrently when I do batch classification, to speed up the process. The problem is that EFS throughput cap is not PER connection, but in total. So the 300MB/s of burst throughput that I was allowed seemed to be shared by each lambda instance, which at that point timed out even before being able to load the models into memory.
Keep this in mind when you choose this option.
Upvotes: 3