Kravitz

Reputation: 2859

Google App Engine - spawns a new instance for every connection or has zero instances

I am noticing something a little odd with Google App Engine. If my app has not been used for a while and I open it, it takes some time to load. The GAE logs console shows a server starting up during this time, which accounts for the wait. (Why not always have an instance running?)

After I open and close the app a couple of times, the Versions tab of GAE shows 7 running instances (all in the same version).

I'm a little confused about how GAE works. Does it scale your instances down to 0 when there are no requests for a while, and on the flip side, does it spin up a new instance for every new client that connects?

My app.yaml looks like this:

  runtime: nodejs10
  env: standard

  instance_class: F2

  handlers:
    - url: /.*
      secure: always
      redirect_http_response_code: 301
      script: auto    

Upvotes: 1

Views: 1141

Answers (2)

Jan Hernandez

Reputation: 4620

You need to fine-tune your App Engine scaling strategy. For example, consider this app.yaml file:

runtime: nodejs10
env: standard
instance_class: F2
handlers:
  - url: /.*
    secure: always
    redirect_http_response_code: 301
    script: auto

automatic_scaling:
  min_instances: 1
  max_instances: 4
  min_idle_instances: 1
  max_concurrent_requests: 25
  target_throughput_utilization: 0.8

inbound_services:
- warmup

min_instances and min_idle_instances are set to 1 so that at least one instance is always ready for incoming requests, which avoids cold starts.

To avoid spinning up new instances too quickly, you can set max_concurrent_requests and target_throughput_utilization. In this example, a new instance is not spun up until an existing instance reaches 20 concurrent requests (25 × 0.8).
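The 25 × 0.8 arithmetic can be sketched as a rough back-of-the-envelope estimate. This is only a simplified model that assumes the concurrency target is the sole scaling trigger; it is not App Engine's actual scheduler, and the function name estimateInstances is hypothetical:

```javascript
// Rough estimate only: assumes the concurrency target is the only scaling
// trigger. The real autoscaler also considers CPU and pending latency.
function estimateInstances(concurrentRequests, scaling) {
  // Per-instance load at which new capacity starts being added.
  const perInstanceTarget =
    scaling.maxConcurrentRequests * scaling.targetThroughputUtilization;
  const needed = Math.ceil(concurrentRequests / perInstanceTarget);
  // Clamp to the configured min_instances / max_instances bounds.
  return Math.min(scaling.maxInstances, Math.max(scaling.minInstances, needed));
}

// Values taken from the app.yaml in this answer.
const scaling = {
  minInstances: 1,
  maxInstances: 4,
  maxConcurrentRequests: 25,
  targetThroughputUtilization: 0.8,
};

console.log(estimateInstances(10, scaling));  // light load
console.log(estimateInstances(45, scaling));  // moderate load
console.log(estimateInstances(500, scaling)); // heavy load, capped by max_instances
```

Under this model, a handful of browser tabs opening several connections each would stay on one or two instances instead of the 7 seen in the question.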

As mentioned in this document, you need to create a warmup endpoint in your application and add inbound_services to your app.yaml file. For example:

app.get('/_ah/warmup', (req, res) => {
    // Handle your warmup logic. Initiate db connection, etc.
    // Send a response so the warmup request completes instead of hanging.
    res.status(200).end();
});

Warmup requests let App Engine prepare an instance before it receives live traffic, which reduces the latency of the first request.
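Because App Engine does not guarantee a warmup request for every new instance, one option is to make the initialization idempotent so that regular requests can trigger it too. A minimal sketch, assuming Express-style req/res objects; initResources and warmupHandler are hypothetical names, and the body of initResources is a placeholder for your own setup:

```javascript
let ready = false;
let initCount = 0; // only here to show that setup runs once per instance

// Placeholder for expensive startup work (db connection, cache priming, ...).
function initResources() {
  if (ready) return; // already initialized on this instance
  initCount += 1;
  ready = true;
}

// Express-style handler; register it with app.get('/_ah/warmup', warmupHandler).
function warmupHandler(req, res) {
  initResources();
  res.status(200).send('ok');
}
```

Calling initResources() at the top of your other handlers (or in a middleware) covers the case where an instance receives live traffic without being warmed up first.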

Upvotes: 4

bhito

Reputation: 2673

As you did not specify any scaling settings in your app.yaml, App Engine is using automatic scaling with its default values.

That means the application has a minimum of 0 instances, so when your app is not receiving any requests it scales down to 0. With that option you save the cost of keeping an instance running all the time, but cold starts will happen. A cold start happens each time a request reaches your application when no instance is ready to serve it and a new one has to be created.
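One way to spot cold starts in your own logs is to keep a module-level counter, since module state in Node.js lives as long as the instance's process. A minimal sketch; describeRequest is a hypothetical helper, not part of any App Engine API:

```javascript
// Module-level state is created once per process, i.e. once per instance.
const instanceStartedAt = Date.now();
let requestsServed = 0;

// Call once per incoming request, e.g. from a middleware.
function describeRequest() {
  requestsServed += 1;
  return {
    firstRequestOnInstance: requestsServed === 1,
    instanceAgeMs: Date.now() - instanceStartedAt,
  };
}

const first = describeRequest();  // first request handled by this process
const second = describeRequest(); // any later request on the same process
console.log(first, second);
```

A slow request that is also flagged as the first one on its instance is a likely cold start.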

As for your application scaling up to 7 instances when the traffic load increases, that again depends on the workload it is receiving. You can control this behaviour with the max_instances setting, although a low value could hurt your application's performance if more instances are needed.

App Engine spins up new instances when the threshold for target_cpu_utilization, target_throughput_utilization, max_concurrent_requests, max_pending_latency or min_pending_latency is reached. You can read about all of them here.

Upvotes: 1
