dontknowhy

Reputation: 2866

Cloud Run, ideal vCPU and memory amount per instance?

When setting up a Cloud Run service, I am unsure how much memory and how many vCPUs to allocate per instance.

I use Cloud Run for mobile apps.

I am confused about when to increase the vCPU and memory per instance rather than the number of instances, and when to do the opposite.

How should I calculate it?

Upvotes: 1

Views: 3905

Answers (2)

guillaume blaquiere

Reputation: 75715

There is no single good answer to that question. You have to know the limits:

  • The maximum number of concurrent requests that a single instance can handle with 4 vCPU and/or 32 GB of memory (up to 1000 concurrent requests)
  • The maximum number of instances on Cloud Run (1000)

Then it is a matter of tradeoffs, and it depends heavily on your use case.

  • Bigger instances reduce the number of cold starts (and therefore the latency spikes when your service scales up). But if you have only 1 request at a time, you will pay for a BIG instance to do a small amount of processing.
  • Smaller instances let you optimize cost and add only a small slice of resources at a time, but you will have to spawn new instances more often and endure more cold starts.
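
To make the tradeoff concrete, here is a rough sketch, not a sizing formula. The two instance shapes and their concurrency settings are made-up assumptions, and allocated vCPU is used only as a crude stand-in for cost (no real pricing is modeled).

```python
import math

def instances_needed(in_flight_requests: int, concurrency: int) -> int:
    """Instances required so that no instance exceeds its concurrency setting."""
    return math.ceil(in_flight_requests / concurrency)

def allocated_vcpus(in_flight_requests: int, concurrency: int, vcpu_per_instance: int) -> int:
    """Total vCPU allocated across all instances (a crude proxy for cost)."""
    return instances_needed(in_flight_requests, concurrency) * vcpu_per_instance

# Hypothetical shapes: these numbers are assumptions, not recommendations.
SMALL = {"vcpu": 1, "concurrency": 20}
BIG = {"vcpu": 4, "concurrency": 80}

for in_flight in (1, 200):
    for name, cfg in (("small", SMALL), ("big", BIG)):
        n = instances_needed(in_flight, cfg["concurrency"])
        total = allocated_vcpus(in_flight, cfg["concurrency"], cfg["vcpu"])
        print(f"{in_flight:>3} in flight, {name:>5}: {n} instance(s), {total} vCPU allocated")
```

With these assumed numbers, a single in-flight request occupies 4 vCPU on the big shape but only 1 on the small one, while 200 in-flight requests need 10 small instances (ten instance starts) or 3 big ones.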

Optimize for what matters most to you and find the right balance. There is no magic formula!

Upvotes: 3

Ernani Joppert

Reputation: 528

You can simulate request load against your current settings using k6.io, check the memory and CPU utilization of your container, and then lower or raise those settings to see whether you can get more RPS out of a single container.

Once you are satisfied with the throughput of a single container instance, say 100 RPS, you can use gcloud to set the --min-instances and --max-instances flags. These depend, of course, on the --concurrency flag, which in this example would be set to 100.

Also note that concurrency defaults to 80 and can go up to 1000.
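
As a rough sketch of that last step, the snippet below turns a measured per-instance throughput into instance bounds and assembles a gcloud run deploy command line without running it. The traffic figures, the service name my-service, and the image path are all hypothetical placeholders, and no headroom is added on top of the measured numbers.

```python
import math

MAX_CONCURRENCY = 1000  # upper bound for --concurrency mentioned above

def plan_instances(baseline_rps: float, peak_rps: float, rps_per_instance: float):
    """Return (min_instances, max_instances) for a measured per-instance throughput."""
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    min_instances = math.ceil(baseline_rps / rps_per_instance)
    max_instances = max(1, math.ceil(peak_rps / rps_per_instance))
    return min_instances, max_instances

def deploy_command(service: str, image: str, concurrency: int,
                   min_instances: int, max_instances: int):
    """Build (but do not execute) a gcloud run deploy argument list."""
    if not 1 <= concurrency <= MAX_CONCURRENCY:
        raise ValueError("concurrency must be between 1 and 1000")
    return [
        "gcloud", "run", "deploy", service,
        f"--image={image}",
        f"--concurrency={concurrency}",
        f"--min-instances={min_instances}",
        f"--max-instances={max_instances}",
    ]

# Hypothetical traffic: 150 RPS baseline, 950 RPS peak, 100 RPS measured per instance.
min_n, max_n = plan_instances(baseline_rps=150, peak_rps=950, rps_per_instance=100)
print(" ".join(deploy_command("my-service", "gcr.io/my-project/my-image", 100, min_n, max_n)))
```

In practice you would probably round max_instances up further to leave headroom, and re-run the load test after every change to CPU or memory.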

More information can be found at the links below:
https://cloud.google.com/run/docs/about-concurrency
https://cloud.google.com/sdk/gcloud/reference/run/deploy

I would also recommend investigating whether you need to pass the --cpu-throttling or the --no-cpu-throttling flag, depending on how much you need to adjust for cold starts.

Upvotes: 1
