Reputation: 179
The problem I had was that an App Engine instance running a Flask API was stuck in an endless loop of worker restarts and stayed unresponsive the entire time, which prompted App Engine to scale up and add instances (up to 20!).
The Flask API served multiple machine learning models, which had to be loaded one by one. Loading one of these models apparently took so long that the worker was terminated. The logs essentially showed this:
A 2020-03-20T14:42:23Z [2020-03-20 14:42:23 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:2952)
A 2020-03-20T14:42:23Z [2020-03-20 14:42:23 +0000] [2952] [INFO] Worker exiting (pid: 2952)
A 2020-03-20T14:42:24Z [2020-03-20 14:42:24 +0000] [2975] [INFO] Booting worker with pid: 2975
Changing these settings in app.yaml had no effect, as they operate at a higher level (App Engine's instance health checks, not the Gunicorn worker):
liveness_check:
  initial_delay_sec: 300
  check_interval_sec: 30
  timeout_sec: 4
  failure_threshold: 4
  success_threshold: 2
readiness_check:
  check_interval_sec: 5
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 300
Upvotes: 2
Views: 476
Reputation: 21520
You should set --timeout 0 for infinite timeouts.
The gunicorn arbiter gets confused when App Engine scales down instances and thinks workers have timed out.
App Engine has its own supervisor which oversees timeouts (with a much longer timeout period), so it's not necessary for Gunicorn to handle worker timeouts.
Upvotes: 1
Reputation: 179
After a quick Google search, it seemed much more likely that the timeouts were Gunicorn workers running off into the mist. I found the Gunicorn docs, which showed me how to set the worker timeout in seconds.
Lo and behold: in my app.yaml file I added -t 75 to the Gunicorn entrypoint, and that fixed the problem. It turns out that one of the older models, a big Naive Bayes classifier, was taking around 50s just to load, which is longer than Gunicorn's default 30-second worker timeout.
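If you are not sure which model is the slow one, timing each load makes it obvious. This is only a minimal sketch: `timed_load` and the stand-in loader are hypothetical names I made up for illustration, not something from my actual app, and you would swap in whatever function really loads your model.

```python
import logging
import time

logger = logging.getLogger(__name__)


def timed_load(name, loader):
    """Call loader() and return (model, seconds the load took).

    `name` is only used for the log message; `loader` is any
    zero-argument callable that returns the loaded model.
    """
    start = time.perf_counter()
    model = loader()
    elapsed = time.perf_counter() - start
    logger.info("Loaded model %r in %.1fs", name, elapsed)
    return model, elapsed


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Hypothetical stand-in loader; replace with your real model loading call.
    model, seconds = timed_load("naive_bayes", lambda: {"kind": "stand-in"})
```

Any load time that approaches or exceeds the Gunicorn worker timeout is a candidate for the restart loop shown in the logs above.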
My app.yaml:
entrypoint: gunicorn -b :$PORT main:app -t 75
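As an alternative to the -t flag, the same timeout can live in a Gunicorn config file, which is plain Python. A minimal sketch, assuming a file named gunicorn.conf.py next to main.py and a fallback port of 8080 (both are my assumptions, not part of my original setup):

```python
# gunicorn.conf.py (assumed file name; passed to Gunicorn with -c)
import os

# App Engine provides the port in the PORT environment variable;
# 8080 is only an assumed fallback for running locally.
port = os.environ.get("PORT", "8080")
bind = f":{port}"

# Worker timeout in seconds. It must exceed the slowest model load
# (about 50s in my case); Gunicorn's default is 30.
timeout = 75
```

The entrypoint would then become something like gunicorn -c gunicorn.conf.py main:app. Setting timeout = 0 in this file disables the worker timeout entirely, as suggested in the other answer.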
I saw that there were some people running flask APIs on app engine that also encountered this problem in some variation, so I figured I'd provide this extra breadcrumb.
Upvotes: 0