Cthulhujr

Reputation: 433

gUnicorn with systemd Watchdog

We have a requirement to monitor and try to restart our gUnicorn/Django app if it goes down. We're using gunicorn 20.0.4.

I have the following nrs.service running fine under systemd. I'm trying to figure out whether systemd's watchdog can be integrated with Gunicorn. Looking through the source, I don't see sd_notify("WATCHDOG=1") being called anywhere, so I'm thinking that no, Gunicorn doesn't know how to keep systemd aware that it's up: it calls sd_notify("READY=1...") at startup, but nothing in its run loop signals that it's still running.

Here's the nrs.service file. I have commented out the watchdog settings because, with them enabled, the service goes into a failed state shortly after it starts.

[Unit]
Description=Gunicorn instance to serve NRS project
After=network.target

[Service]
WorkingDirectory=/etc/nrs
Environment="PATH=/etc/nrs/bin"
ExecStart=/etc/nrs/bin/gunicorn --error-logfile /etc/nrs/logs/gunicorn_error.log --certfile=/etc/httpd/https_certificate/nrs.cer --keyfile=/etc/httpd/https_certificate/server.key --access-logfile /etc/nrs/logs/gunicorn_access.log --capture-output --bind=nrshost:8800 anomalyalerts.wsgi
#WatchdogSec=15s
#Restart=on-failure
#StartLimitInterval=1min
#StartLimitBurst=4

[Install]
WantedBy=multi-user.target

So the systemd watchdog is doing its thing; it just looks like Gunicorn doesn't support it out of the box. I'm not very familiar with monkey-patching, but I'm thinking that if we want to use this method of monitoring, I'll have to implement some custom code? The other thought was to have a watch command check the service and try to restart it, which might be easier.
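For illustration, the keep-alive itself is small: under systemd's notify protocol it is a datagram containing WATCHDOG=1 sent to the Unix socket named in the NOTIFY_SOCKET environment variable. Below is a minimal sketch of that, not Gunicorn code. The function names sd_notify and watchdog_interval are hypothetical helpers, and where they would be called from (for example a patched master loop) is left open. It also assumes the unit lets the process send notifications (Type=notify or a suitable NotifyAccess=).

```python
import os
import socket


def sd_notify(message: str) -> bool:
    """Send one datagram to systemd's notify socket.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process was not
    started by systemd with notifications enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    address = path.encode("utf-8")
    if address.startswith(b"@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        address = b"\0" + address[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), address)
    return True


def watchdog_interval():
    """Return a ping interval in seconds, or None if no watchdog is set.

    systemd exports WatchdogSec= to the service as WATCHDOG_USEC
    (microseconds); pinging at half that interval is the usual advice.
    """
    usec = os.environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    return int(usec) / 1_000_000 / 2
```

A loop would then call sd_notify("WATCHDOG=1") every watchdog_interval() seconds for as long as the process considers itself healthy.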

Thanks Jason

Upvotes: 2

Views: 773

Answers (1)

wombatonfire

Reputation: 5420

"monitor and try to restart our gUnicorn/Django app if it goes down"

systemd's watchdog will not help in the described case. The reason is that the watchdog is intended to monitor the main service process, which does not run your app directly.

Gunicorn's master process, which is the main service process from systemd's perspective, is a loop that manages the worker processes. Your app runs inside the worker processes, so if anything goes wrong there, the worker is what should be restarted, not the master.

Gunicorn restarts worker processes automatically (see the timeout setting). As for the main service process, in the rare case that it dies, the Restart=on-failure option can restart it even without a watchdog (see the docs for details on how it behaves).
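As a rough sketch of where that timeout lives: Gunicorn can read its settings from a Python config file. The file name gunicorn.conf.py and the worker count below are assumptions for illustration (the question passes everything on the command line); bind is copied from the question's ExecStart.

```python
# gunicorn.conf.py (hypothetical file name)
# Load it with: gunicorn -c gunicorn.conf.py anomalyalerts.wsgi

# Address taken from the ExecStart line in the question.
bind = "nrshost:8800"

# Number of worker processes; 3 is an assumed value, tune it for the host.
workers = 3

# Seconds a worker may stay silent before the master kills and
# replaces it. 30 is Gunicorn's default.
timeout = 30

# Seconds a worker gets to finish in-flight requests after being
# asked to stop or restart. 30 is Gunicorn's default.
graceful_timeout = 30
```

With this in place the master replaces hung or crashed workers on its own, and systemd only has to cover the case where the master itself exits.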

Upvotes: 1
