Reputation: 2866
When setting up Cloud Run, I am unsure how much memory and how many vCPUs I should allocate to each server instance.
I use Cloud Run for mobile apps.
I am confused about when to increase vCPU and memory instead of increasing server instances, and when to increase server instances instead of vCPU and memory.
How should I calculate it?
Upvotes: 1
Views: 3905
Reputation: 75715
There isn't a good answer to that question. You have to know the limits:
Then it's a matter of tradeoffs, and it's highly dependent on your use case.
Optimize for what matters most to you and find the right balance. There is no magic formula!
Upvotes: 3
Reputation: 528
You can simulate request load against your current settings using k6.io, check the memory and CPU utilization of your container, and then lower or raise those settings to see whether you can get more requests per second (RPS) out of a single container.
Once you are satisfied with the throughput of a single container instance, let's say 100 RPS per instance, you can then use gcloud to set the flags --min-instances and --max-instances, depending of course on the --concurrency flag, which in my explanation would be set to 100.
Also note that concurrency defaults to 80 and can be raised up to 1000.
More information is available at the links below:
https://cloud.google.com/run/docs/about-concurrency
https://cloud.google.com/sdk/gcloud/reference/run/deploy
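To make the instance arithmetic concrete, here is a rough sketch in Python. It assumes you have already measured the sustainable RPS of one instance with a load test. The function names, the 25% headroom default, and the example numbers are illustrative assumptions, not anything Cloud Run prescribes.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float,
                     headroom: float = 0.25) -> int:
    """Estimate a --max-instances value for a given peak load.

    peak_rps:         expected peak requests per second (your own estimate)
    rps_per_instance: throughput one instance sustained in your load test
    headroom:         extra capacity as a fraction (0.25 = 25%, an assumption)
    """
    if peak_rps < 0 or rps_per_instance <= 0 or headroom < 0:
        raise ValueError("inputs must be non-negative and rps_per_instance > 0")
    # Round up, and never go below one instance.
    return max(1, math.ceil(peak_rps * (1 + headroom) / rps_per_instance))


def in_flight_requests(rps: float, avg_latency_s: float) -> float:
    """Little's law: average concurrent requests = arrival rate * latency.

    Useful as a sanity check on the --concurrency value for one instance.
    """
    return rps * avg_latency_s


# Example: 1000 RPS at peak, 100 RPS per instance, 25% headroom.
print(instances_needed(1000, 100))    # 13
# 100 RPS with 0.5 s average latency keeps about 50 requests in flight.
print(in_flight_requests(100, 0.5))   # 50.0
```

If the in-flight estimate for one instance is well below your --concurrency setting, the instance is more likely limited by CPU or memory than by the concurrency cap, which points towards resizing the instance rather than adding more of them.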
I would also recommend investigating whether you need to pass the --cpu-throttling flag or the --no-cpu-throttling flag, depending on your need for adjusting for cold starts.
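As a sketch of how the flags mentioned in this answer fit together, the Python snippet below assembles a gcloud run deploy command and prints it without running it. The service name, image path, and all numeric values are hypothetical placeholders; replace them with figures from your own load tests.

```python
import shlex


def build_deploy_command(service: str, image: str, *, cpu: int, memory: str,
                         concurrency: int, min_instances: int,
                         max_instances: int,
                         cpu_throttling: bool = True) -> list[str]:
    """Build the argument list for `gcloud run deploy` (does not execute it)."""
    return [
        "gcloud", "run", "deploy", service,
        "--image", image,
        "--cpu", str(cpu),                  # vCPUs per instance
        "--memory", memory,                 # e.g. "512Mi"
        "--concurrency", str(concurrency),  # max concurrent requests per instance
        "--min-instances", str(min_instances),
        "--max-instances", str(max_instances),
        "--cpu-throttling" if cpu_throttling else "--no-cpu-throttling",
    ]


# Hypothetical service and image names, for illustration only.
cmd = build_deploy_command(
    "my-api", "gcr.io/my-project/my-api:latest",
    cpu=1, memory="512Mi", concurrency=100,
    min_instances=1, max_instances=13,
)
print(shlex.join(cmd))
```

Printing the command first lets you review the settings before deploying; you can then paste it into a shell or pass the list to subprocess.run.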
Upvotes: 1