Ahmed Ibrahim

Reputation: 23

Simple query causes memory leak in Django

I work at a company that has a large database, and I want to run some update queries on it, but doing so seems to cause a huge memory leak. The query is as follows:

import datetime
import pytz

c = CallLog.objects.all()
for i in c:
    i.cdate = pytz.utc.localize(datetime.datetime.strptime(i.fixed_date, "%y-%m-%d %H:%M"))
    i.save()

I ran this in the Django interactive shell.

I even tried wrapping the loop in

with transaction.atomic():

but it didn't help. Do you have any idea how I can detect the source of the leak?

The dataset I am working on has about 27 million records.

fixed_date is a calculated property.

Upvotes: 2

Views: 1138

Answers (3)

mathias.lantean

Reputation: 611

You could try something like this:

import datetime

import pytz
from django.core.paginator import Paginator

# Order by primary key so that page boundaries stay stable between queries.
p = Paginator(CallLog.objects.order_by('pk').only('cdate'), 2000)
for page in range(1, p.num_pages + 1):
    for i in p.page(page).object_list:
        i.cdate = pytz.utc.localize(datetime.datetime.strptime(i.fixed_date, "%y-%m-%d %H:%M"))
        i.save()

Slicing a queryset does not load all the objects into memory just to return a subset; it adds LIMIT and OFFSET to the SQL query before it hits the database.
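To make the LIMIT/OFFSET idea concrete, here is a rough sketch of the arithmetic in plain Python. It is only an illustration, not Django's actual implementation: page_window is a hypothetical helper, and a plain list stands in for the table.

```python
def page_window(page_number, per_page, total):
    """Return (offset, limit) for a 1-based page number."""
    if page_number < 1:
        raise ValueError("page_number must be >= 1")
    offset = (page_number - 1) * per_page
    # The last page may be shorter than per_page.
    limit = max(0, min(per_page, total - offset))
    return offset, limit


rows = list(range(10))  # stand-in for table rows, not a real queryset
offset, limit = page_window(2, 4, len(rows))
page = rows[offset:offset + limit]  # analogous to LIMIT 4 OFFSET 4
print(offset, limit, page)
```

With a page size of 2000, only about that many model instances need to be alive at any one time.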

Upvotes: 1

Walucas

Reputation: 2578

Try breaking it into small blocks (since you have only 4 GB of RAM):

c = CallLog.objects.filter(somefield=somevalue)

When it's necessary, I usually filter on a character or number (for example, IDs ending in 1, 2, 3, 4, etc.).
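As a small sketch of why splitting on the last digit works, the snippet below groups some made-up IDs into ten buckets in plain Python. group_by_last_digit is a hypothetical helper and the ID range is invented for illustration; in Django each bucket would correspond to one filtered queryset, and the exact filter depends on your model.

```python
from collections import defaultdict


def group_by_last_digit(ids):
    """Group integer IDs by their last decimal digit (0-9)."""
    buckets = defaultdict(list)
    for pk in ids:
        buckets[pk % 10].append(pk)
    return dict(buckets)


ids = range(1, 26)  # made-up primary keys 1..25
buckets = group_by_last_digit(ids)
print(buckets[3])
```

Every ID lands in exactly one bucket, so processing the buckets one after another touches each row once while keeping each block roughly a tenth of the full size.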

Upvotes: 0

Ralf

Reputation: 16515

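You could try iterating over the queryset with the .iterator() method, which reads results without caching them all on the queryset. See if that improves anything:

# .iterator() skips the queryset's result cache, so instances are not all kept in memory.
for obj in CallLog.objects.all().iterator():
    obj.cdate = pytz.utc.localize(
        datetime.datetime.strptime(obj.fixed_date, "%y-%m-%d %H:%M"))
    obj.save()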
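The general idea of consuming a large result lazily, one batch at a time, can be sketched in plain Python as follows. in_batches is a hypothetical helper, not part of Django, and range(7) stands in for the rows; with a real queryset you would presumably pass it a lazy iterator instead.

```python
from itertools import islice


def in_batches(iterable, size):
    """Yield lists of up to `size` items without materializing the whole input."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


batches = list(in_batches(range(7), 3))
print(batches)
```

Only one batch is held in memory at a time while looping over the generator; the list() call here is just to display the result.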

Here is a related answer I found, but it is a few years old.

Upvotes: 0
