Reputation: 9
I have a Django app that is served by gunicorn. I use statsd to export the API's metrics to a Prometheus server. My statsd exporter runs in a container whose image is prom/statsd-exporter. My API and the statsd exporter run on a swarm and are on the same swarm network, so I don't have to deal with public exposure of those metrics.
Sometimes, for whatever reason, my statsd exporter container restarts. It can be due to the swarm agent doing its job or something else, but I expect this behaviour not to be a problem (except maybe losing a few metrics, but not many).
What I observe is that when the statsd container restarts, gunicorn stops exporting its metrics, and when the container is up again gunicorn does not resume sending them. I have to relaunch the API service to get metrics back. I did not find any solution to this behaviour on the internet. Maybe it is a feature?
I thought about adding a healthcheck on statsd in order to restart my API when statsd is down, but if statsd is really down I don't want my API to keep restarting because of it. I could set a long interval between healthchecks, like 10 minutes, which would not really impact my service, but I'm not completely convinced by this solution. I would prefer finding a way to make gunicorn reconnect to statsd when it comes back up after having gone down.
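To make the behaviour I'm hoping for concrete, here is a minimal sketch in plain Python. It rests on an assumption I have not verified: that metrics stop because the exporter's address is resolved only once, so a restarted container with a new IP is never found again. `ResolvingStatsdSender` is a hypothetical helper, not part of gunicorn, and the service name `statsd-exporter` in the usage note is a placeholder for my swarm service.

```python
import socket


class ResolvingStatsdSender:
    """Hypothetical helper: sends statsd lines over UDP and resolves the
    host name again on every send instead of once at startup."""

    def __init__(self, host, port, prefix=""):
        self.host = host
        self.port = port
        self.prefix = prefix
        # Unconnected UDP socket: no destination address is pinned to it.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def _send(self, line):
        try:
            # sendto() resolves self.host at call time, so a new container
            # IP behind the same name is picked up on the next send.
            self.sock.sendto(line.encode("ascii"), (self.host, self.port))
            return True
        except OSError:
            # Metrics are best effort: a resolution or send failure must
            # never break request handling.
            return False

    def increment(self, name, value=1):
        # statsd counter format: <name>:<value>|c
        return self._send(f"{self.prefix}{name}:{value}|c")

    def gauge(self, name, value):
        # statsd gauge format: <name>:<value>|g
        return self._send(f"{self.prefix}{name}:{value}|g")
```

With something like `ResolvingStatsdSender("statsd-exporter", 9125, prefix="api.")`, a send that fails while the exporter is down is simply dropped, and later sends go to whatever address the name resolves to at that moment. What I'm looking for is a way to get this behaviour from gunicorn's own statsd instrumentation.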
Thanks for your help.
Upvotes: 0
Views: 98