Kevin G.

Reputation: 1453

Does django + mod_wsgi require threaded programming discipline?

We're rolling out our first django application under mod_wsgi as

`WSGIDaemonProcess our-appname processes=6 threads=15`

and we're having a discussion about whether our Python code and the Redis and Postgres libraries it uses need to be thread-safe or not.

From what I can tell from reading the mod_wsgi documentation, even though the Apache worker is processing requests with multiple Apache threads, our Python code is for all intents and purposes single-threaded. I don't see any warnings in the mod_wsgi docs saying "Beware!! You now have to worry about Global Data and Thread Safety!" but there's also no explicit "Don't Worry About Threads There Aren't Any".

We're not doing anything explicitly with threads in our Python code; there is no mention of them in anything we've written.

But some people here are of the opinion that since we're running with threads=15, we are now in the multi-threaded world.

Can anyone clarify what's actually going on here? Is our Python code now subject to multiple threads of execution through the same data where it wasn't before, or not?

Upvotes: 6

Views: 2235

Answers (3)

weatherfrog

Reputation: 3190

I think Andrew's answer is a little bit misleading. The fact that CPython (note that there are other Python implementations such as Jython and PyPy) has the GIL does not mean you don't have to worry about your code being thread-safe! Because of the GIL, no two threads in one process can execute Python code at the same time. But parallelism is simulated by periodically switching between threads, and such a context switch can happen at any time during the execution of your program. As an example, if you have a module foo containing a "global" variable x, then the following function might print 1, 2, 3, 4, ..., depending on how many threads are executing that same function and where the switches fall:

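import foo  # the module holding the shared "global" variable x

def bar():
    foo.x = 1
    # a context switch might happen here!
    foo.x = foo.x + 1  # read and write are separate steps, not one atomic operation
    # or here!
    print(foo.x)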

Actually, you can configure mod_wsgi to use at most one thread per process, and then you don't have to worry about thread safety. But the correctness of your program would then depend on the configuration of the web server, which is a very undesirable situation.
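If the shared value really has to live at module level, one common way out is to guard the read-modify-write with a lock. This is only a minimal sketch: the module-level `x` stands in for `foo.x`, and `x_lock`, `bar_locked` and `run` are names made up for illustration, not anything mod_wsgi or Django provides.

```python
import threading

x = 0                      # shared "global", standing in for foo.x
x_lock = threading.Lock()  # hypothetical lock guarding every access to x


def bar_locked():
    """Increment x and return the value this call produced."""
    global x
    with x_lock:
        # No other thread can run this block until we leave it, so the
        # read, the add and the write behave as one step.
        x = x + 1
        return x


def run(n_threads=15, calls_per_thread=1000):
    """Call bar_locked() from several threads and collect every result."""
    results = []

    def worker():
        for _ in range(calls_per_thread):
            results.append(bar_locked())

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the lock in place every call sees a distinct value, whatever `threads=` is set to in the server config. Without it, two threads could read the same `x` and both write back the same result.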

Upvotes: 1

Andrew Gorcester

Reputation: 19953

The Python interpreter is not thread-safe, particularly because of reference counting, so threads cannot access Python objects concurrently in the same process space. There is no way you can configure mod_wsgi to get around this, accidentally or intentionally, because the interpreter is protected by the GIL (global interpreter lock). Therefore, you don't have to worry about the particularly tricky thread-safety issues that come with the risk of simultaneous threads accessing the same memory objects (memory locking, etc.).

Some web servers, such as gunicorn with a gevent worker, will have multiple threads in memory concurrently so that no individual process needs to be blocked on I/O (database access, network access, etc). This may also be the case with mod_wsgi. However, this is implemented in such a way that you shouldn't need to worry about it in your application code -- if your application is safe to use in multiprocessing it should also be safe to use in this sort of limited non-concurrent threading model.

Of course, you can't use global variables or dynamically edit parts of your application as it runs, but if you are doing anything like that in Django then you will encounter problems even before you have to worry about threading. Django and other web frameworks are designed so that data is passed in as a request and out as a response without having to worry about thread/process safety within that model.

You DO need to worry about concurrent access to data stores (esp. database entries), as for any web application. Code defensively when it comes to database access.
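As a rough illustration of coding defensively against the data store, the sketch below lets the database do an increment in a single `UPDATE` instead of reading the value into Python and writing it back. It uses the standard-library `sqlite3` module with an in-memory database purely so it runs on its own; the question uses Postgres, and the `counters` table and the function names here are invented for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one counter row (example schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE counters (name TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO counters VALUES ('home', 0)")
    conn.commit()
    return conn


def record_hit(conn, name):
    """Increment inside the database rather than in application code."""
    # 'with conn' commits on success and rolls back on an exception.
    with conn:
        conn.execute(
            "UPDATE counters SET hits = hits + 1 WHERE name = ?", (name,)
        )


def hits(conn, name):
    """Return the current counter value for the given name."""
    row = conn.execute(
        "SELECT hits FROM counters WHERE name = ?", (name,)
    ).fetchone()
    return row[0]
```

The same idea applies whether the concurrent writers are threads or separate processes: a SELECT followed by an UPDATE computed in Python can lose updates, while a single statement cannot. In Django, `F()` expressions and `select_for_update()` are the usual tools for this.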

Upvotes: 1

Anurag Uniyal

Reputation: 88727

Yes, obviously you are running a multi-threaded app, and it will create problems if you don't take care with globals, class attributes, etc.

If you need to keep something globally, keep it in thread-local storage.

Here is a quote from the mod_wsgi documentation, Building_A_Portable_Application:
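For example, a minimal sketch using the standard library's `threading.local`; the names `_request_state`, `set_current_user`, `get_current_user` and `handle` are made up for illustration:

```python
import threading

# Each thread sees its own independent set of attributes on this object.
_request_state = threading.local()  # hypothetical module-level holder


def set_current_user(name):
    """Remember the user for the thread handling the current request."""
    _request_state.user = name


def get_current_user():
    """Return this thread's user, or None if none has been set here."""
    return getattr(_request_state, "user", None)


def handle(name, seen):
    """Stand-in for a request handler running in a worker thread."""
    set_current_user(name)
    # Other threads calling set_current_user() cannot overwrite our value.
    seen[name] = get_current_user()
```

A value set in one thread is simply absent in every other thread, so two requests being served at the same time never see each other's data.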

3. An application must be re-entrant, or simply put, be able to be called concurrently by multiple threads at the same time. Data which needs to exist for the life of the request, would need to be stored as stack based data, thread local data, or cached in the WSGI application environment. Global variables within the actual application module cannot be used for this purpose.

So I think you have been sufficiently warned.
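To make the quoted requirement concrete, here is a small sketch of a WSGI callable that keeps its per-request data in local variables and in the `environ` dict rather than in module globals. The `myapp.request_path` key and the `render` helper are invented for the example.

```python
def render(environ):
    """Build the response text from data carried in this request's environ."""
    return "You asked for " + environ["myapp.request_path"]


def application(environ, start_response):
    """Re-entrant WSGI app: nothing request-specific is stored globally."""
    # The environ dict is created per request, so stashing data in it is
    # safe even when many threads are inside this function at once.
    environ["myapp.request_path"] = environ.get("PATH_INFO", "/")

    # 'body' is a local variable: it lives on this call's stack only.
    body = render(environ).encode("utf-8")

    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Because every piece of request state is either a local or a key in that request's own `environ`, the function can be entered by all 15 threads at once without them interfering with each other.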

Upvotes: 6
