Reputation: 585
In my setup I am using Gunicorn to deploy on a single-CPU machine with three worker processes. This question came out of this answer: https://stackoverflow.com/a/53327191/10268003 . Sending a mail takes up to one and a half seconds, so I am trying to send email asynchronously. I want to understand what happens to a Gunicorn worker process that starts a new thread to send the mail: will the process be blocked until the mail-sending thread finishes? If so, I believe my application's throughput will decrease. I do not want to use Celery, because setting it up just to send emails seems like overkill. I am currently running two containers on the same development machine, with three Gunicorn workers each.
Below is the approach in question; the only difference is that I will be using the thread to send mails.
import threading

from django.http import JsonResponse

from .models import Crawl


def startCrawl(request):
    task = Crawl()
    task.save()
    # Run the slow work in a background thread so the response returns at once.
    t = threading.Thread(target=doCrawl, args=[task.id], daemon=True)
    t.start()
    return JsonResponse({'id': task.id})


def checkCrawl(request, id):
    task = Crawl.objects.get(pk=id)
    return JsonResponse({'is_done': task.is_done, 'result': task.result})


def doCrawl(id):
    task = Crawl.objects.get(pk=id)
    # Do crawling, etc. (this step is expected to produce `result`)
    task.result = result
    task.is_done = True
    task.save()
Upvotes: 0
Views: 3361
Reputation: 125
Assuming that you are using gunicorn Sync (default), Gthread or Async workers, you can indeed spawn threads, and gunicorn will neither notice nor interfere. The worker goes on to answer the following requests immediately after returning a result; it does not wait until all spawned threads have been joined.
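To illustrate the point outside of gunicorn, here is a minimal standard-library sketch. `send_mail_slowly` and `handle_request` are hypothetical stand-ins (no mail is sent and no web framework is involved); the sketch only shows that the function that starts the thread returns before the slow work finishes.

```python
import threading
import time

finished = threading.Event()


def send_mail_slowly(recipient):
    # Hypothetical stand-in for a slow SMTP call; no mail is actually sent.
    time.sleep(0.5)
    finished.set()


def handle_request(recipient):
    # Start the slow work in the background and return right away.
    worker = threading.Thread(target=send_mail_slowly, args=(recipient,), daemon=True)
    worker.start()
    return {"status": "queued"}


response = handle_request("user@example.com")

# The "mail" is still being sent when the handler has already returned.
still_running = not finished.is_set()

# The background thread completes on its own a little later.
completed_later = finished.wait(timeout=5)
```

One caveat with `daemon=True`: Python stops daemon threads abruptly when the process exits, so a mail that is in flight when a worker process shuts down may be lost.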
I have used this code to fire an independent event a minute or so after a request:
from threading import Timer
Timer(timeout, function_that_does_something, [arguments_to_function]).start()
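A self-contained version of that one-liner might look like the sketch below. The short delay, the `"hello"` argument and the `fired` event are assumptions added so the example can be run and observed quickly; the original used a delay of about a minute.

```python
from threading import Event, Timer

fired = Event()
received = []


def function_that_does_something(message):
    # Runs in the timer's own thread once the delay has elapsed.
    received.append(message)
    fired.set()


timeout = 0.2  # seconds; shortened for the demo
timer = Timer(timeout, function_that_does_something, ["hello"])
timer.daemon = True  # assumption: do not keep the process alive just for the timer
timer.start()

# The calling thread is free immediately; we wait here only to observe the result.
fired.wait(timeout=5)
```

`Timer` is a subclass of `threading.Thread`, so the same considerations as for any other spawned thread apply.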
You will find some more technical details in this other answer:
In normal operations, these Workers run in a loop until the Master either tells them to shut down gracefully or kills them. Workers will periodically issue a heartbeat to the Master to indicate that they are still alive and working. If a heartbeat timeout occurs, then the Master will kill the Worker and restart it.
Therefore, daemon and non-daemon threads that do not interfere with the Worker's main loop should have no impact. If the thread does interfere with the Worker's main loop, such as a scenario where the thread is performing work and will provide results to the HTTP Response, then consider using an Async Worker. Async Workers allow for the TCP connection to remain alive for a long time while still allowing the Worker to issue heartbeats to the Master.
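The worker type and the heartbeat timeout described in that quote correspond to gunicorn settings. As a rough sketch, a `gunicorn.conf.py` for the setup in the question could look like this; the concrete values are illustrative assumptions, not recommendations.

```python
# gunicorn.conf.py (sketch; values are illustrative assumptions)
workers = 3               # three worker processes, as in the question
worker_class = "gthread"  # threaded worker; "sync" is the default
threads = 2               # threads per worker, used by the gthread worker
timeout = 30              # seconds of silence before the master kills and restarts a worker
graceful_timeout = 30     # seconds a worker gets to finish requests after a restart signal
```

Such a file is passed with `gunicorn -c gunicorn.conf.py myproject.wsgi`, where `myproject.wsgi` is a placeholder for the actual application module.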
I have recently moved on to asynchronous, event-loop-based solutions, such as the uvicorn worker for gunicorn together with the FastAPI framework, which provide alternatives to waiting in threads for IO.
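The event-loop idea can be sketched with plain `asyncio` from the standard library. This is not FastAPI or uvicorn code; `send_mail` and `handle_request` are hypothetical stand-ins, and the sketch only shows a handler scheduling IO-bound work on the loop and returning without awaiting it.

```python
import asyncio

sent = []


async def send_mail(recipient):
    # Hypothetical stand-in for an awaitable SMTP call; no mail is sent.
    await asyncio.sleep(0.2)
    sent.append(recipient)


async def handle_request(recipient, background):
    # Schedule the send on the event loop and respond without awaiting it.
    task = asyncio.create_task(send_mail(recipient))
    background.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(background.discard)
    return {"status": "queued"}


async def main():
    background = set()
    response = await handle_request("user@example.com", background)
    pending_at_response = len(sent) == 0  # nothing sent yet when the response is ready
    await asyncio.gather(*background)     # demo only: wait so the result can be inspected
    return response, pending_at_response


response, pending_at_response = asyncio.run(main())
```

No extra thread is created here: while `send_mail` is waiting, the same event loop is free to serve other requests.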
Upvotes: 1