Reputation: 2859
I am noticing something a little odd with Google App Engine. If my app has not been used for a while and I open it, it takes some time to load. The GAE logs console shows that a server is being started during this time, which accounts for the wait (why not always have an instance running?).
After I open and close the app a couple of times I then notice in the versions tab of GAE that I have 7 running instances (all in the same version).
I'm a little confused about how GAE works. Does it scale your instances down to 0 when there are no requests for a while and, on the flip side, does it spin up a new instance for every new client that connects?
My app.yaml looks like this:
runtime: nodejs10
env: standard
instance_class: F2
handlers:
  - url: /.*
    secure: always
    redirect_http_response_code: 301
    script: auto
Upvotes: 1
Views: 1141
Reputation: 4620
You need to fine-tune your App Engine scaling strategy. For example, consider this app.yaml file:
runtime: nodejs10
env: standard
instance_class: F2
handlers:
  - url: /.*
    secure: always
    redirect_http_response_code: 301
    script: auto
automatic_scaling:
  min_instances: 1
  max_instances: 4
  min_idle_instances: 1
  max_concurrent_requests: 25
  target_throughput_utilization: 0.8
inbound_services:
  - warmup
min_instances and min_idle_instances are set to 1 so that at least one instance is always ready for incoming requests, which avoids cold starts.
To avoid spinning up new instances too quickly, you can set max_concurrent_requests and target_throughput_utilization. In this example, a new instance will not be spun up until an existing instance reaches 20 concurrent requests (25 × 0.8).
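The 25 × 0.8 arithmetic can be written out as a small helper. This is only a sketch of the simplified model described in this answer, not the scheduler's actual algorithm, and the function name scaleOutThreshold is made up for illustration:

```javascript
// Simplified model (an assumption, not App Engine's real scheduler logic):
// an instance is treated as "full" once its concurrent requests reach
// max_concurrent_requests * target_throughput_utilization.
function scaleOutThreshold(maxConcurrentRequests, targetThroughputUtilization) {
  return maxConcurrentRequests * targetThroughputUtilization;
}

// Values taken from the app.yaml in this answer.
const threshold = scaleOutThreshold(25, 0.8);
console.log(`New instances are considered at about ${threshold} concurrent requests`);
```

Lowering target_throughput_utilization makes App Engine scale out earlier, while raising it packs more requests onto each instance before another one is started.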
As mentioned in this document, you need to create a warmup endpoint in your application and add inbound_services to your app.yaml file, for example:
app.get('/_ah/warmup', (req, res) => {
  // Handle your warmup logic: initiate DB connections, etc.
  res.sendStatus(200); // respond so the warmup request does not hang
});
Warmup requests let App Engine prepare your instances before live traffic arrives, which reduces the latency of the first request.
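For a fuller picture, here is a minimal sketch of a warmup route that uses only Node's built-in http module. The ready flag and the warmUp function are hypothetical placeholders for your own setup work, and the response bodies are arbitrary. Because warmup requests may not be sent before every first request, the sketch also runs the setup lazily on normal requests:

```javascript
const http = require('http');

// Hypothetical flag recording whether the expensive setup has already run.
let ready = false;

// Hypothetical placeholder for one-time setup (DB connections, caches, ...).
async function warmUp() {
  if (!ready) {
    // e.g. await db.connect();
    ready = true;
  }
}

async function handleRequest(req, res) {
  if (req.url === '/_ah/warmup') {
    // Sent by App Engine when inbound_services includes "warmup".
    await warmUp();
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('warmed up');
    return;
  }
  // Fall back to lazy setup in case no warmup request reached this instance.
  await warmUp();
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('hello');
}

// Creating the server does not start listening yet.
const server = http.createServer(handleRequest);
```

In the real entry point you would then call server.listen(process.env.PORT || 8080), since App Engine passes the port to listen on through the PORT environment variable.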
Upvotes: 4
Reputation: 2673
Since you did not specify any scaling settings in your app.yaml, App Engine is using automatic scaling.
That means the application has a minimum of 0 instances, so when your app is not receiving any requests it will scale down to 0. With that option you save the cost of keeping an instance running all the time, but cold starts will happen. A cold start happens each time a request reaches your application when no instance is ready to serve it, so a new one has to be created.
As for your application scaling up to 7 instances when the traffic load increases, that again depends on the workload it is receiving. You can control this behaviour with the max_instances setting, although a low value could hurt your application's performance if more instances are needed.
App Engine will spin up new instances when the threshold for target_cpu_utilization, target_throughput_utilization, max_concurrent_requests, max_pending_latency or min_pending_latency is reached. You can read about all of them here.
Upvotes: 1