Reputation: 42870
I'm currently running an application on Google App Engine Standard Environment (Python).
With only 1 or 2 hours left before the daily (24-hour) free quota reset, I use up all 28 instance hours.
My traffic is low most of the time, except for about 8 hours at night, when it is high.
My app.yaml pretty much falls back to the default settings:
application: my-webapp
version: 1
runtime: python27
api_version: 1
threadsafe: false
I would still like to rely on automatic_scaling. I'm willing to trade away a bit of App Engine performance in exchange for no daily charge.
In the Flexible Environment, I see there is a config where we can specify the number of instances:
https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 1
I would like to limit the maximum number of instances in my App Engine Standard Environment app. However, I can't find a max_num_instances config in the Standard Environment.
https://cloud.google.com/appengine/docs/standard/python/config/appref#scaling_elements
What I find as the valid config options under Standard Environment's automatic_scaling are:
I would like to use all 28 instance hours and accept slightly lower performance, with no daily charge incurred :)
May I know which config parameter I should start tuning?
I tried:
automatic_scaling:
  max_idle_instances: 1
  min_idle_instances: 0
  max_concurrent_requests: 80
However, it seems to make things worse.
5 instances are created but none of them are active?!
So far I have served only 2k requests, but 16.8 instance hours have already been consumed.
Compare that with another app of mine, which serves higher traffic (but with lower latency): it always has only 1 instance, and so far only 8.43 instance hours have been consumed.
I don't have any special parameters in my higher-traffic app's yaml file, so I'm not sure why the number of spawned instances differs.
Upvotes: 1
Views: 1848
Reputation: 39834
If your app supports it (not all apps do! - it depends on how they're coded) try setting threadsafe: true, allowing one instance to serve multiple requests in parallel. That would reduce the overall request latency and so help the GAE autoscaler decide to launch fewer instances. If this works you can also try tweaking the related max_concurrent_requests.
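A minimal sketch of what that change could look like in the question's app.yaml. The handler entry and the main.app name are placeholders of mine (the question doesn't show its handlers), and the max_concurrent_requests value is only illustrative. With threadsafe: true on python27, script handlers have to point at a WSGI application object rather than a CGI-style script file.

```yaml
# Sketch only: "main.app" is a hypothetical WSGI application object.
application: my-webapp
version: 1
runtime: python27
api_version: 1
threadsafe: true   # let one instance handle requests in parallel

handlers:
- url: /.*
  script: main.app   # module.attribute form, not a script path like main.py

automatic_scaling:
  max_concurrent_requests: 16   # illustrative value, tune after observing load
```

As far as I understand, an instance running with threadsafe: false handles one request at a time, so raising max_concurrent_requests alone (as in the question) is unlikely to help until threadsafe is enabled.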
Another thing to try would be to tell the autoscaler that your app can tolerate higher request latencies via min_pending_latency and max_pending_latency.
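For example, something along these lines makes requests wait longer in the pending queue before the autoscaler starts another instance. The values are assumptions chosen for illustration, not recommendations. Each is a number followed by ms or s.

```yaml
automatic_scaling:
  min_pending_latency: 5s    # wait at least this long before starting a new instance
  max_pending_latency: 10s   # the longest a request should wait before a new instance is started
```

Higher values mean fewer instances but slower responses during the nightly peak, so it's worth raising them gradually while watching the instance-hours graph.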
Set max_idle_instances to 1 (or even 0, if you're able to) to prevent the autoscaler from starting idle instances that aren't actually helping much with serving traffic. See "What does setting the automatic_scaling max_idle_instances to zero (0) do?"
Finally, if you really want to put a hard cap on your instance number you can switch to basic scaling, which does have a max_instances configuration option. But be aware that this can seriously degrade user experience if you suddenly have a high request load.
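A rough sketch of the basic scaling variant, replacing the automatic_scaling section. The instance class and timeout are assumptions for illustration.

```yaml
instance_class: B1     # basic scaling runs on B-class instances
basic_scaling:
  max_instances: 1     # hard cap on the number of instances
  idle_timeout: 10m    # shut the instance down after this much idle time
```

Before relying on this, check how B-class instance hours count against the free quota, since it may not be the same allowance as the 28 hours you get with automatic scaling.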
Upvotes: 1