I have a custom Django (v2.0.0) management command that starts background job executers in multiple threads, and it appears to leak memory.
The command can be started like so:
./manage.py start_job_executer --thread=1
Each thread has a while True loop that picks up jobs from a PostgreSQL table.
In order to pick up the job and change the status atomically I used transactions:
# Atomic transaction to temporarily lock the selected row and
# fetch the oldest job in the db with column status = pending
with transaction.atomic():
    job = Job.objects.select_for_update() \
        .filter(status=Job.STATUS['pending']) \
        .order_by('created_at').first()
    if job:
        job.status = Job.STATUS['executing']
        job.save()
It looks like the memory allocated by this custom Django command keeps growing.
Using tracemalloc I tried to find what is causing the memory leak by creating a background thread that checks the memory allocation:
def check_memory(self):
    while True:
        s1 = tracemalloc.take_snapshot()
        sleep(10)
        s2 = tracemalloc.take_snapshot()
        for alog in s2.compare_to(s1, 'lineno')[:10]:
            log.info(alog)
This produced the following log:
01.04.20 13:50:06 operations.py:222: size=23.7 KiB (+23.7 KiB), count=66 (+66), average=367 B
01.04.20 13:50:36 operations.py:222: size=127 KiB (+43.7 KiB), count=353 (+122), average=367 B
01.04.20 13:51:04 operations.py:222: size=251 KiB (+66.7 KiB), count=699 (+186), average=367 B
01.04.20 13:51:31 operations.py:222: size=379 KiB (+68.9 KiB), count=1056 (+192), average=367 B
01.04.20 13:51:57 operations.py:222: size=495 KiB (+60.3 KiB), count=1380 (+168), average=367 B
It looks like /usr/local/lib/python3.5/dist-packages/django/db/backends/postgresql/operations.py:222 does not release memory.
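For reference, the snapshot comparison used above can be reproduced outside Django. The sketch below is a minimal, self-contained version; the `leaky` list and the `top_growth` helper are hypothetical names used only for illustration, and the retained strings merely imitate query text that is never released. Note that `tracemalloc.start()` has to be called before the first snapshot, otherwise no allocations are traced.

```python
import tracemalloc

# Hypothetical stand-in for whatever keeps references alive in the real process.
leaky = []


def top_growth(before, after, limit=10):
    """Return the `limit` largest per-line differences between two snapshots."""
    return after.compare_to(before, 'lineno')[:limit]


tracemalloc.start()  # tracing must be enabled before any snapshot is taken

s1 = tracemalloc.take_snapshot()
for i in range(1000):
    # Each retained string imitates a logged SQL query that is never released.
    leaky.append('SELECT * FROM job WHERE id = %d' % i)
s2 = tracemalloc.take_snapshot()

stats = top_growth(s1, s2)
for stat in stats:
    print(stat)

tracemalloc.stop()
```

Each printed line has the same shape as the log entries in this question: file and line number, total size, size difference, and block count.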
The leak is slow with 1 thread, but with 8 threads it is worse:
01.04.20 13:07:51 operations.py:222: size=68.3 KiB (+68.3 KiB), count=191 (+191), average=366 B
01.04.20 13:08:56 operations.py:222: size=770 KiB (+140 KiB), count=2151 (+390), average=367 B
01.04.20 13:10:07 operations.py:222: size=1476 KiB (+138 KiB), count=4122 (+386), average=367 B
01.04.20 13:36:22 operations.py:222: size=17.3 MiB (+138 KiB), count=49506 (+385), average=367 B
01.04.20 13:48:16 operations.py:222: size=24.5 MiB (+136 KiB), count=69993 (+379), average=367 B
This is the code around line 222 of /usr/local/lib/python3.5/dist-packages/django/db/backends/postgresql/operations.py:
def last_executed_query(self, cursor, sql, params):
    # http://initd.org/psycopg/docs/cursor.html#cursor.query
    # The query attribute is a Psycopg extension to the DB API 2.0.
    if cursor.query is not None:
        return cursor.query.decode()  # this is line 222!
    return None
I have no clue how to attack this problem. Any ideas at all?
I also posted it here: https://code.djangoproject.com/ticket/31419#ticket
I was thinking of forking a new process for every job that needs to be executed; once the job finishes, its memory would be released when the process exits. This would probably work, but it seems like overkill.
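The process-per-job idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `run_job` and `execute_in_child` are hypothetical names, the job body is a placeholder, and the `fork` start method is available on POSIX platforms only. In a real Django process the database connections would most likely also have to be closed before forking, so that the child opens its own.

```python
import multiprocessing


def run_job(job_id):
    """Placeholder job body (hypothetical); memory allocated here dies with the child."""
    data = [str(job_id)] * 100000  # simulate a memory-hungry job
    return len(data)


def execute_in_child(job_id, timeout=30):
    """Run one job in a short-lived child process and return its exit code."""
    ctx = multiprocessing.get_context('fork')  # assumption: POSIX platform
    proc = ctx.Process(target=run_job, args=(job_id,))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        # The job overran its time budget; stop it so the worker can move on.
        proc.terminate()
        proc.join()
    return proc.exitcode


if __name__ == '__main__':
    for job_id in (1, 2, 3):
        print(job_id, execute_in_child(job_id))
```

An exit code of 0 means the child finished normally; whatever it allocated is returned to the operating system when it exits, regardless of what the job kept references to.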
Thanks in advance
UPDATE
I was using Django 2.0 and tried updating to Django 3.0.5 (the latest stable release), but unfortunately the problem is still there.
Below are the new logs:
01.04.20 20:15:06 operations.py:235: size=977 KiB (+53.9 KiB), count=2750 (+152), average=364 B
01.04.20 20:15:28 operations.py:235: size=1070 KiB (+50.1 KiB), count=3012 (+141), average=364 B
01.04.20 20:15:53 operations.py:235: size=1156 KiB (+43.7 KiB), count=3255 (+123), average=364 B
01.04.20 20:16:19 operations.py:235: size=1245 KiB (+44.7 KiB), count=3507 (+126), average=364 B
01.04.20 20:20:23 operations.py:235: size=2154 KiB (+44.3 KiB), count=6065 (+125), average=364 B
Reputation: 5116
Django keeps a reference to every executed query in a ring buffer when settings.DEBUG = True.
From the DEBUG documentation:
"It is also important to remember that when running with DEBUG turned on, Django will remember every SQL query it executes. This is useful when you’re debugging, but it’ll rapidly consume memory on a production server."
Setting DEBUG = False should address your issue.
To wipe the ring buffer in situations where it may pose a problem in development:
from django.conf import settings
from django.db import reset_queries
if settings.DEBUG:
    reset_queries()
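The behaviour described above can be modelled with a bounded `collections.deque`. This is a simplified stand-in, not Django's code: the `QueryLog` class is hypothetical, and the default of 9000 entries is an assumption based on Django's per-connection `queries_limit`. The example uses a limit of 5 to keep the output short.

```python
from collections import deque


class QueryLog:
    """Hypothetical model of a per-connection query log with a fixed capacity."""

    def __init__(self, limit=9000):  # assumed default, see the note above
        self.entries = deque(maxlen=limit)

    def record(self, sql):
        # Once the deque is full, appending silently drops the oldest entry.
        self.entries.append({'sql': sql})

    def reset(self):
        # Analogous in spirit to django.db.reset_queries().
        self.entries.clear()


log = QueryLog(limit=5)
for i in range(8):
    log.record('SELECT %d' % i)

print(len(log.entries))       # 5: growth stops at the limit
print(log.entries[0]['sql'])  # SELECT 3: the three oldest entries were dropped

log.reset()
print(len(log.entries))       # 0
```

Until the buffer fills up, every query adds one retained string, which looks like a steady leak in tracemalloc. Since each thread uses its own database connection, more threads mean more buffers to fill.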