max_num_instances for Google App Engine Standard Environment

Question

I'm currently running an application on Google App Engine Standard Environment (Python).

With 1 or 2 hours still to go before the daily (24-hour) free quota reset, I have already used up all 28 instance hours.

My traffic is low most of the time, except for about 8 hours at night, when it is high.

My app.yaml pretty much falls back to the default settings:

application: my-webapp
version: 1
runtime: python27
api_version: 1
threadsafe: false

I would still like to rely on automatic_scaling. I'm willing to trade some App Engine performance for no daily charge.

In the Flexible Environment, I see there's a config where we can specify the number of instances:

https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml

automatic_scaling:
  min_num_instances: 1
  max_num_instances: 1

I would like to limit the maximum number of instances in my App Engine Standard Environment app. However, I can't find a max_num_instances config for the Standard Environment.

https://cloud.google.com/appengine/docs/standard/python/config/appref#scaling_elements

The valid settings I find under the Standard Environment's automatic_scaling are:

max_concurrent_requests
max_idle_instances
max_pending_latency
min_idle_instances
min_pending_latency

I would like to use all 28 instance hours and accept slightly lower performance, with no daily charge incurred :)

Which config parameter should I start fine-tuning?

Update

I tried:

automatic_scaling:
  max_idle_instances: 1
  min_idle_instances: 0
  max_concurrent_requests: 80

However, it seems to make things worse.

5 instances are created, but none of them are active?!

So far I have served only 2k requests, but 16.8 instance hours have already been consumed.

Compare that with another app of mine, which serves higher traffic (but with lower latency). It always runs just 1 instance, and so far it has consumed only 8.43 instance hours.

I don't have any special parameters in the higher-traffic app's yaml file, so I'm not sure why the number of spawned instances differs.

Dan Cornilescu · Accepted Answer

If your app supports it (not all apps do! - it depends on how they're coded) try setting threadsafe: true, allowing one instance to serve multiple requests in parallel, which would reduce the overall request latency, thus helping the GAE autoscaler decide to launch fewer instances. If this works you can also try tweaking the related max_concurrent_requests.
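As a rough sketch of this first suggestion, the app.yaml from the question might become something like the following. It assumes the handlers are already WSGI applications (the python27 runtime does not accept CGI-style script handlers when threadsafe is true) and that the code itself is safe to run concurrently. The value 16 is only an illustrative starting point, not a recommendation.

```yaml
application: my-webapp
version: 1
runtime: python27
api_version: 1
# Changed from false: lets one instance serve several requests in parallel.
# Only safe if the application code is written to be thread-safe.
threadsafe: true

# handlers section omitted here, as in the question

automatic_scaling:
  # Illustrative value (assumption); tune it while watching latency and
  # instance counts in the console.
  max_concurrent_requests: 16
```

With threadsafe: false each instance handles one request at a time, so raising max_concurrent_requests on its own is unlikely to help.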

Another thing to try would be to inform the autoscaler that your app can tolerate higher request latencies via min_pending_latency and max_pending_latency.
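A minimal sketch of what that could look like in app.yaml. The two numbers are assumptions chosen purely for illustration; suitable values depend on how long your users can reasonably wait.

```yaml
automatic_scaling:
  # Let requests wait at least this long in the pending queue before the
  # autoscaler considers starting another instance (illustrative value).
  min_pending_latency: 3000ms
  # Upper bound on how long a request may wait before a new instance is
  # started to handle it (illustrative value).
  max_pending_latency: 10000ms
```

Higher values trade slower responses during traffic spikes for fewer instances, which is the trade-off the question asks for.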

Set max_idle_instances to 1 (or even 0, if you're able to) to prevent the autoscaler from starting idle instances that aren't actually helping much with serving traffic. See "What does setting the automatic_scaling max_idle_instances to zero (0) do?"

Finally, if you really want to put a hard cap on your instance number you can switch to basic scaling, which does have a max_instances configuration option. But be aware that this can seriously degrade user experience if you suddenly have a high request load.
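For reference, a hedged sketch of a basic scaling setup. It replaces the automatic_scaling block (the two scaling types cannot be combined) and needs a B-class instance. The instance class and timeout shown are assumptions for illustration.

```yaml
application: my-webapp
version: 1
runtime: python27
api_version: 1
threadsafe: false

# Basic scaling runs on B-class instances (illustrative choice).
instance_class: B1

basic_scaling:
  # Hard cap: never run more than one instance.
  max_instances: 1
  # Shut the instance down after this much idle time (illustrative value).
  idle_timeout: 10m
```

Before switching, it is worth checking how the free instance-hour quota applies to B-class instances, since it may not match the 28 hours mentioned in the question.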
