anhtran

Reputation: 2044

Django Channels: Gets stuck after a period of time

I run the code from https://github.com/andrewgodwin/channels-examples/tree/master/multichat for around 50 users.

It gets stuck without any warning. The server is not down, and the access log shows nothing unusual. When I stop the daphne server (with Ctrl+C), it takes about 5-10 minutes to shut down completely. Sometimes I have to run the kill command.

Strangely, when I run daphne under supervisord and restart it every 30 minutes using crontab, websockets connect normally. It's hacky, but it works.

My config: HAProxy => Daphne

daphne -b 192.168.0.6 -p 8000 yyapp.asgi:application --access-log=/home/admin/daphne.log

backend daphne
        balance source
        option http-server-close
        option forceclose
        timeout check 1000ms
        reqrep ^([^\ ]*)\ /ws/(.*) \1\ /\2
        server daphne 192.168.0.6:8000 check maxconn 10000 inter 5s

Debian: 9.4 (original kernel) on OVH server.
Python: 3.6.4
Daphne: 2.2.1
Channels: 2.1.2
Django: 1.11.15
Redis: 4.0.11

I know this question may be too general, but I really have no idea what causes this. I tried upgrading Python and reinstalling all the packages, but it didn't help.

Upvotes: 1

Views: 1464

Answers (2)

GoodDay

Reputation: 31

I also ran into this weird issue while running Django with Selenium. This is the only solution I found after a frustrating week.

This creates a middleware that intercepts each HTTP request and adds a 30-second timeout. You can drop the protocol check by removing the type check. Note that after the timeout the function keeps running in the background; it is not killed or terminated. Whatever the function returns after the timeout has no effect, because the middleware has already returned a bad-request response.

Remove the threading lock if you don't want requests to be handled one at a time. I have it because Selenium doesn't support multiple get calls on the same driver.

Create middleware.py inside the app:
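The timeout behaviour described above can be seen in isolation with a minimal sketch. It assumes only the standard library; `slow_view` is a hypothetical stand-in for a slow request handler and is not part of the middleware. The point it illustrates is that `future.result(timeout=...)` stops waiting but does not stop the worker thread.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

finished = threading.Event()  # set by the worker when it completes


def slow_view():
    """Hypothetical stand-in for a request handler that takes too long."""
    time.sleep(0.3)
    finished.set()
    return "late response"


executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(slow_view)

try:
    result = future.result(timeout=0.1)  # wait at most 0.1 s
except TimeoutError:
    result = "timeout"

# The caller has given up, but the worker thread is still running.
still_running = not finished.is_set()

# The worker completes on its own a little later; its return value is unused.
completed_later = finished.wait(timeout=2)
executor.shutdown(wait=True)
```

Because the worker keeps going, a handler that never returns will occupy its thread indefinitely, so the timeout protects the client's wait time rather than the server's resources.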

import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError

from django.http import HttpResponseBadRequest


class TimeoutContext:
    """Run a callable in a worker thread and wait at most `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.executor = ThreadPoolExecutor(max_workers=1)
        self.future = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # wait=False: do not block on a call that has already timed out.
        # The worker thread is not killed; it finishes in the background.
        self.executor.shutdown(wait=False)
        return False

    def run(self, func, *args, **kwargs):
        self.future = self.executor.submit(func, *args, **kwargs)
        # Raises concurrent.futures.TimeoutError if the call takes too long.
        return self.future.result(timeout=self.timeout)


class SequentialRequestMiddleware:
    # Shared by all instances so that requests are handled one at a time.
    _lock = threading.Lock()

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # request.scope is available when Django is served over ASGI.
        if request.scope.get("type") != "http":
            return self.get_response(request)

        # The lock is released automatically, even if an exception is raised.
        with self._lock, TimeoutContext(30) as timeout:
            try:
                return timeout.run(self.get_response, request)
            except TimeoutError:
                return HttpResponseBadRequest("Timeout")

After that, add the middleware to your Django settings:

app.middleware.SequentialRequestMiddleware
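As a rough sketch, the entry goes into the MIDDLEWARE list in settings.py. The other entries shown here are only illustrative examples of Django's stock middleware, and `app` is the app name used above; keep whatever your project already lists. Placing the entry last is an assumption: it means the timeout wraps only the view, while placing it earlier would also cover the middleware listed after it.

```python
# settings.py (fragment)
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    # ... the rest of your existing middleware ...
    "app.middleware.SequentialRequestMiddleware",
]
```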

Upvotes: 0

ok123jump

Reputation: 109

Web servers and load balancers are, in general, very bad with persistent connections. You need to give HAProxy explicit instructions so it knows when and how to time out unused tunnels.

There are four timeouts that HAProxy will need to keep track of:

  1. timeout client
  2. timeout connect
  3. timeout server
  4. timeout tunnel

The first three are related to the initial HTTP negotiation phase of the socket connection. As soon as the connection is established, only timeout tunnel matters. You will need to tinker with the values for your own application, but some suggested values to start with are:

  1. timeout client: 25s
  2. timeout connect: 5s
  3. timeout server: 25s
  4. timeout tunnel: 3600s

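In your config, that would be the following (timeout client is a client-side setting, so if HAProxy warns that it is ignored in a backend, move that line to the frontend or defaults section):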

backend daphne
    balance source
    option http-server-close
    option forceclose
    timeout check 1000ms
    timeout client 25s
    timeout connect 5s
    timeout server 25s
    timeout tunnel 3600s
    reqrep ^([^\ ]*)\ /ws/(.*) \1\ /\2
    server daphne 192.168.0.6:8000 check maxconn 10000 inter 5s
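To make the interplay of these values concrete, here is a small Python sketch of the rule described above: before the WebSocket upgrade the client and server timeouts apply, and afterwards only timeout tunnel matters. The helper names `parse_haproxy_time` and `idle_timeout` are hypothetical, and the model is deliberately simplified (it treats the pre-upgrade limit as the smaller of the client and server values); it is not how HAProxy is implemented.

```python
import re

# Seconds per HAProxy time suffix; a bare number is read as milliseconds.
_UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}


def parse_haproxy_time(value):
    """Convert an HAProxy time value such as '25s' or '1000ms' to seconds."""
    match = re.fullmatch(r"(\d+)(us|ms|s|m|h|d)?", value.strip())
    if match is None:
        raise ValueError(f"not an HAProxy time value: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_SECONDS[unit or "ms"]


def idle_timeout(timeouts, upgraded):
    """Return the idle limit in seconds that governs a connection.

    Simplified model: once the connection has been upgraded to a tunnel,
    'tunnel' supersedes 'client' and 'server'.
    """
    if upgraded and "tunnel" in timeouts:
        return parse_haproxy_time(timeouts["tunnel"])
    return min(
        parse_haproxy_time(timeouts["client"]),
        parse_haproxy_time(timeouts["server"]),
    )


suggested = {"client": "25s", "connect": "5s", "server": "25s", "tunnel": "3600s"}
before_upgrade = idle_timeout(suggested, upgraded=False)
after_upgrade = idle_timeout(suggested, upgraded=True)
```

With the suggested values, an idle WebSocket is allowed an hour of silence once established, so an application that stays quiet for longer would need either a larger tunnel timeout or periodic traffic.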

You might need to tinker with the other timeouts to get a good mixture. Some timeouts that may affect your setup - and some starting values - are:

  1. timeout http-keep-alive: 1s
  2. timeout http-request: 15s
  3. timeout queue: 30s
  4. timeout tarpit: 60s

Of course, read up and customize to suit your needs.

Reference: Haproxy - Websockets Load Balancing

Upvotes: 1
