Green 绿色

Reputation: 2886

Gunicorn: multiple background worker threads

Setup: My application uses multiple workers to process elements in parallel. Processing those elements is CPU-intensive, so I need worker processes rather than threads. The application is exposed via a Flask API served by Gunicorn, which itself uses multiple worker processes to handle requests in parallel. In the Flask API, the request data is put onto a queue, and the worker processes of my background application take the data from that queue.

Problem: Forking worker processes is quite time-intensive, and the application has to meet a certain speed requirement. I would therefore like to spawn the background worker processes when the app starts. To avoid mixing results, I need n background worker processes for every Gunicorn worker.

Question: How can I determine at startup how many background workers I have to spawn, and how can I link those workers to Gunicorn workers?

Approach: I can read the number of Gunicorn workers from gunicorn_config.py by importing the workers variable. However, at that point I do not yet know the Gunicorn worker process IDs. Do the workers have internal IDs that I could use instead (e.g., Gunicorn worker #1, ...)?
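The import the asker describes can be sketched as follows; `configured_worker_count` and `background_workers_needed` are illustrative helper names (not Gunicorn APIs), and the sketch assumes gunicorn_config.py is on the import path and defines `workers` as a plain integer:

```python
# Sketch: derive the background-pool size from the Gunicorn config module.
# Assumption: gunicorn_config.py is importable and defines `workers` as an int.
import importlib

def configured_worker_count(module_name="gunicorn_config"):
    # Import the same module Gunicorn is started with and read `workers`.
    cfg = importlib.import_module(module_name)
    return cfg.workers

def background_workers_needed(per_gunicorn_worker, module_name="gunicorn_config"):
    # n background processes for every Gunicorn worker.
    return per_gunicorn_worker * configured_worker_count(module_name)
```

This only answers the "how many" half of the question; it says nothing about which Gunicorn worker a background process belongs to.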

Upvotes: 3

Views: 3219

Answers (1)

You need to be aware of (and account for) the fact that a Gunicorn worker can be stopped at any time, for example because the request handler crashed or a processing timeout was hit. This means that the wiring of your particular background workers to a particular Gunicorn worker process cannot be permanent.

If you want to link your background workers to a particular Gunicorn worker, the relinking has to happen every time a Gunicorn worker is restarted.

One approach would be to define your own post_fork and/or post_worker_init handler that does that wiring.
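A minimal sketch of that wiring in gunicorn_config.py, assuming the simplest variant where each Gunicorn worker spawns its own private pool right after the fork (post_fork and worker_exit are documented Gunicorn server hooks; `background_loop` and `N_BACKGROUND` are illustrative assumptions, and the squaring stands in for the real CPU-heavy work):

```python
# gunicorn_config.py -- sketch: per-worker background pool via post_fork.
import multiprocessing

workers = 2        # Gunicorn worker processes
N_BACKGROUND = 2   # background processes per Gunicorn worker (assumption)

def background_loop(task_queue, result_queue):
    # Each background process pulls tasks from its Gunicorn worker's
    # private queue until it receives the None sentinel.
    while True:
        item = task_queue.get()
        if item is None:
            break
        result_queue.put(item * item)  # placeholder for CPU-heavy work

def post_fork(server, worker):
    # Runs inside each freshly forked Gunicorn worker, so the queues and
    # background processes created here are private to that worker.
    worker.task_queue = multiprocessing.Queue()
    worker.result_queue = multiprocessing.Queue()
    worker.background = [
        multiprocessing.Process(
            target=background_loop,
            args=(worker.task_queue, worker.result_queue),
            daemon=True,
        )
        for _ in range(N_BACKGROUND)
    ]
    for p in worker.background:
        p.start()

def worker_exit(server, worker):
    # One sentinel per background process so the pool shuts down cleanly
    # whenever this Gunicorn worker is stopped or restarted.
    for _ in worker.background:
        worker.task_queue.put(None)
    for p in worker.background:
        p.join()
```

Because the pool is created after the fork and torn down in worker_exit, a restarted Gunicorn worker automatically gets a fresh pool, which addresses the impermanence problem above. The drawback is that the fork cost is paid on every worker (re)start.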

Initially you can start the required number of background workers, and then the post_fork handler can "borrow" (or, as you call it, "link") them to the Gunicorn worker that has just been created.

Upvotes: 3
