Reputation: 10647
I want to handle dyno restarts on Heroku according to their description here:
During this time they should stop accepting new requests or jobs and attempt to finish their current requests, or put jobs back on the queue for other worker processes to handle.
From the looks of it, when Python receives SIGTERM and the signal handler is called (per signal.signal), the currently running thread is stopped, so the request is interrupted in the middle of running.
How do I meet both requirements? (stop accepting new requests + finish the current requests)
Upvotes: 5
Views: 885
Reputation: 13619
EDIT: Added simplified example code, explained ongoing requests/termination better and added gist from CrazyPython.
On the face of it, you have 4 problems to solve. I'll take them in turn and then give some sample code that should help clarify:
Handling SIGTERM
This is simple. You just need to set up a signal handler that records that you need to shut down. PyMOTW (Python Module of the Week) has a good set of examples of how to catch the signal. You could use variants of that code to catch SIGTERM and set a global flag that says you are shutting down.
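As a minimal sketch of that idea (the names `_shutting_down`, `install_sigterm_handler` and `is_shutting_down` are made up for illustration, not taken from PyMOTW or Heroku), the handler below does nothing except flip a module-level flag. It assumes a POSIX system and that the handler is installed from the main thread, which `signal.signal` requires.

```python
import os
import signal
import time

# Hypothetical module-level flag; other code only ever reads it.
_shutting_down = False


def _handle_sigterm(signum, frame):
    """Only record that shutdown was requested; do no real work here."""
    global _shutting_down
    _shutting_down = True


def install_sigterm_handler():
    """Register the handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, _handle_sigterm)


def is_shutting_down():
    return _shutting_down


if __name__ == "__main__":
    install_sigterm_handler()
    # Simulate the platform sending SIGTERM to this process (POSIX only).
    os.kill(os.getpid(), signal.SIGTERM)
    time.sleep(0.1)  # give the interpreter a chance to run the handler
    print("shutting down:", is_shutting_down())
```

Because the handler only sets a flag, whatever the process was doing when the signal arrived simply resumes afterwards.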
Rejecting new requests
Django middleware provides a neat way of hooking any HTTP request to your application. You could create a simple process_request() hook that returns an error page if the global flag (from above) is set.
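To keep the example runnable without Django installed, the sketch below uses a plain WSGI wrapper rather than a real Django middleware class; a Django project is served through a WSGI callable, so the same check could sit there or be ported into a process_request() hook. The class name `RejectDuringShutdown`, the `is_shutting_down` callable and the choice of a 503 response with a `Retry-After` header are all assumptions made for illustration.

```python
class RejectDuringShutdown(object):
    """WSGI wrapper that answers 503 once shutdown has been requested."""

    def __init__(self, app, is_shutting_down):
        self._app = app
        # Zero-argument callable that returns True after SIGTERM (assumed
        # to be backed by whatever flag the signal handler sets).
        self._is_shutting_down = is_shutting_down

    def __call__(self, environ, start_response):
        if self._is_shutting_down():
            body = b"Service is restarting, please retry.\n"
            start_response("503 Service Unavailable", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
                ("Retry-After", "10"),
            ])
            return [body]
        # Not shutting down: pass the request through untouched.
        return self._app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


if __name__ == "__main__":
    flag = {"down": False}
    app = RejectDuringShutdown(demo_app, lambda: flag["down"])
    seen = []
    app({}, lambda status, headers: seen.append(status))
    flag["down"] = True
    app({}, lambda status, headers: seen.append(status))
    print(seen)
```

Requests that were already inside the wrapped application when the flag flipped are not affected; only requests that arrive afterwards are turned away.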
Completing existing requests
With any new requests stopped, you now have to let your current requests complete. Although you might not believe it right now, this means you simply do nothing and let the program just carry on running as usual after the SIGTERM. Let me expand on that...
The contract with Heroku is that you must complete within 10 seconds of a SIGTERM, or it will send a SIGKILL anyway. That means there is nothing you can do (as a well-behaved application) to ensure that all requests always complete. Consider the 2 cases:
In both cases, therefore, the solution is just to let your program carry on running to let as many current requests complete before terminating.
Terminating your application
The simplest thing to do might be to wait for the SIGKILL to come along from Heroku 10 seconds later. It's not elegant, but it should be OK because you are rejecting any new requests.
If that's not good enough, you need to track your outstanding requests and use that to decide when you can close down your application. The exact way to close your application will depend on whatever is hosting it, so I can't give you exact guidance there. Hopefully the sample code gives you enough of a pointer, though.
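One way to track outstanding requests is a counter guarded by a condition variable, with a bounded wait so the process never sits past the platform's kill deadline. This is only a sketch: `RequestTracker` and `wait_until_idle` are hypothetical names, and the timeout you pass is your own choice of budget (something under the 10 seconds mentioned above). The wait is meant to be called from ordinary code once the shutdown flag is seen, not from inside the signal handler.

```python
import threading
import time


class RequestTracker(object):
    """Counts in-flight requests and lets shutdown code wait for zero."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def __enter__(self):
        with self._cond:
            self._count += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()
        return False  # never swallow exceptions from the request

    def wait_until_idle(self, timeout):
        """Return True if no requests remain, False if the timeout hit."""
        deadline = time.monotonic() + timeout
        with self._cond:
            while self._count > 0:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                self._cond.wait(remaining)
            return True


if __name__ == "__main__":
    tracker = RequestTracker()
    started = threading.Event()

    def fake_request():
        with tracker:  # wrap each request like this
            started.set()
            time.sleep(0.3)

    worker = threading.Thread(target=fake_request)
    worker.start()
    started.wait()
    print("drained:", tracker.wait_until_idle(timeout=8))
    worker.join()
```

When `wait_until_idle` returns, the caller can exit however its hosting environment expects; if it returns False, some requests were still running and will be cut off by the SIGKILL regardless.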
Sample code
Starting from the signal handler example in PyMOTW, I've beefed up the code to add multiple threads processing requests and a termination manager to catch the signal and allow the app to shut down gracefully. You should be able to run this in Python 2.7 and then try killing the process.
Building on this example, CrazyPython created this gist to give a concrete implementation in Django.
```python
import signal
import os
import time
import threading
import random


class TerminationManager(object):
    """Tracks in-flight requests and whether SIGTERM has been received."""

    def __init__(self):
        self._running = True
        self._requests = 0
        self._lock = threading.Lock()
        # Must be constructed on the main thread for signal.signal to work.
        signal.signal(signal.SIGTERM, self._start_shutdown)

    def _start_shutdown(self, signum, stack):
        print('Received: {}'.format(signum))
        self._running = False

    def start_request(self):
        with self._lock:
            self._requests += 1

    def stop_request(self):
        with self._lock:
            self._requests -= 1

    def is_running(self):
        # True until SIGTERM has arrived and no requests remain in flight.
        return self._running or self._requests > 0

    def running_requests(self):
        return self._requests


class DummyWorker(threading.Thread):
    def __init__(self, app_manager):
        super(DummyWorker, self).__init__()
        self._manager = app_manager

    def run(self):
        while self._manager.is_running():
            # Emulate random work and delay between requests.
            if random.random() > 0.9:
                self._manager.start_request()
                time.sleep(random.randint(1, 3))
                self._manager.stop_request()
            else:
                time.sleep(1)
        print('Stopping worker')


manager = TerminationManager()
print('My PID is: {}'.format(os.getpid()))

for _ in range(10):
    t = DummyWorker(manager)
    t.start()

# The main thread just reports progress until the manager says we are done.
while manager.is_running():
    print('Waiting with {} running requests'.format(manager.running_requests()))
    time.sleep(5)

print('All done!')
```
Upvotes: 2