Reputation: 677
I have deployed a FastAPI ML service to Google App Engine but it's exhibiting some odd behavior. The FastAPI service is intended to receive requests from a main service (via Cloud Tasks) and then send responses back. And that does happen. But it appears the route in the FastAPI service that handles these requests gets called four times instead of just once.
My assumption was that GAE, gunicorn, or FastAPI would ensure that the handler runs once per cloud task. But it appears that multiple workers, or some other issue in my config, is causing the handler to get called four times. Here are a few more details and some specific questions:
I deploy with gcloud app deploy app.yaml
The app.yaml file includes GUNICORN_ARGS: "--graceful-timeout 3540 --timeout 3600 -k gevent -c gunicorn.gcloud.conf.py main:app"
The Dockerfile in the FastAPI project root (which is used for the gcloud deploy) also includes the final command gunicorn -c gunicorn.gcloud.conf.py main:app
Here's the gunicorn conf:
import multiprocessing
import os

bind = ":" + os.environ["PORT"]
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornWorker"
forwarded_allow_ips = "*"
max_requests = 1000
max_requests_jitter = 100
timeout = 200
graceful_timeout = 6000
So I'm confused: does GUNICORN_ARGS in app.yaml or the gunicorn argument in the Dockerfile take precedence?
Happy to provide any other relevant info.
Upvotes: 1
Views: 550
Reputation: 136
GAE Flex defines environment variables in the app.yaml file [1]. For comparison, Docker Compose documents its own precedence rules: "In the case of environment, labels, volumes, and devices, Compose “merges” entries together with locally-defined values taking precedence." [2] If a .env file is in use, "Values in the shell take precedence over those specified in the .env file." [3]
[1] https://cloud.google.com/appengine/docs/flexible/custom-runtimes/configuring-your-app-with-app-yaml#defining_environment_variables
[2] https://docs.docker.com/compose/extends/
[3] https://docs.docker.com/compose/environment-variables/
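One way to check empirically which settings actually took effect is to have gunicorn log its effective configuration at startup. The following is only a sketch, assuming it is appended to the gunicorn.gcloud.conf.py from the question; `when_ready` is one of gunicorn's server hooks, and `GUNICORN_ARGS` is the variable name from the question's app.yaml. As far as I know, gunicorn itself only reads extra arguments from `GUNICORN_CMD_ARGS`, so whether `GUNICORN_ARGS` has any effect would depend on how the container's entrypoint uses it.

```python
import os


def when_ready(server):
    """gunicorn server hook: called in the master process once it has started."""
    cfg = server.cfg
    # Values here reflect the merged result of config file, env and CLI flags.
    server.log.info(
        "effective config: workers=%s worker_class=%s timeout=%s graceful_timeout=%s",
        cfg.workers,
        cfg.worker_class_str,
        cfg.timeout,
        cfg.graceful_timeout,
    )
    # Show what the environment actually contains inside the container.
    server.log.info(
        "GUNICORN_ARGS=%r GUNICORN_CMD_ARGS=%r",
        os.environ.get("GUNICORN_ARGS"),
        os.environ.get("GUNICORN_CMD_ARGS"),
    )
```

If the log shows the UvicornWorker class and timeout = 200, the config file values are in effect; if it shows gevent and 3600, the command-line style arguments were applied on top of them.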
The problem is unlikely to be duplicate delivery by Cloud Tasks itself: "in production, more than 99.999% of tasks are executed only once." [4]. You can investigate the calling source to see whether it enqueues the task more than once.
[4] https://cloud.google.com/tasks/docs/common-pitfalls#duplicate_execution
You can also investigate the log contents to see if there are unique identifiers, or if they are the same logs.
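As a rough sketch of what to log, the helper below pulls the task identifiers out of the request headers together with the worker's process ID. It assumes the tasks arrive with the `X-CloudTasks-*` headers (HTTP targets) or the `X-AppEngine-*` headers (App Engine targets); the function name `describe_task_request` is made up for this example.

```python
import os
from typing import Mapping

# Header names tried for each field, lowercased for case-insensitive lookup.
TASK_HEADER_FIELDS = {
    "task_name": ("x-cloudtasks-taskname", "x-appengine-taskname"),
    "retry_count": ("x-cloudtasks-taskretrycount", "x-appengine-taskretrycount"),
    "execution_count": (
        "x-cloudtasks-taskexecutioncount",
        "x-appengine-taskexecutioncount",
    ),
}


def describe_task_request(headers: Mapping[str, str]) -> dict:
    """Return task identifiers found in the headers plus the current PID."""
    lowered = {key.lower(): value for key, value in headers.items()}
    info = {"pid": os.getpid()}
    for field, candidates in TASK_HEADER_FIELDS.items():
        # First matching header wins; None if the header is absent.
        info[field] = next(
            (lowered[name] for name in candidates if name in lowered), None
        )
    return info
```

In the FastAPI route you could call `describe_task_request(request.headers)` and log the result. Four log lines with the same task name and an increasing retry count would suggest Cloud Tasks is retrying one task (for example because the handler did not return a 2xx response in time), whereas four different task names would point at the caller enqueuing four tasks.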
For the second question, on uvicorn [0] workers, you can try hard-coding the value of “workers” to 1 and check whether the repetition goes away.
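For that experiment, a minimal version of the config file might look like the sketch below. It keeps the setting names from the question; the fallback port 8080 is my own assumption so the file also loads outside GAE.

```python
import os

# GAE provides PORT; the 8080 fallback is only an assumption for local runs.
bind = ":" + os.environ.get("PORT", "8080")

# Hard-coded to a single worker for the duration of the test,
# instead of multiprocessing.cpu_count() * 2 + 1.
workers = 1

worker_class = "uvicorn.workers.UvicornWorker"
timeout = 200
```

If the handler is still invoked four times with a single worker, the number of workers is probably not the cause and the duplicate calls are more likely coming from retries or from the caller.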
Upvotes: 2