HyeongJin Kim
HyeongJin Kim

Reputation: 37

Increase the performance in Django ORM

contents = Content.objects.filter(   
    is_excluded=False
).prefetch_related('comments', 'likes', 'dislikes').order_by('-id').all()[:500]
result = [(v, len(v.comments.all())) for v in contents if len(v.likes.all()) -len(v.dislikes.all()) > 24][:7]

I want to return the content which has more than 25 like and 500 recently content.
This works, but is there other way to increase the performance?

Upvotes: 3

Views: 176

Answers (1)

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 476659

You can filter at the database side already. This will prevent making an extra query for the .prefetch_related(…) but will also limit the bandwidth used to transfer data:

from django.db.models import Count

contents = Content.objects.annotate(
    nlikes=Count('likes')
).filter(
    is_excluded=False,
    nlikes__gt=24
).order_by('-id')[:500]

This will thus obtain the 500 Contents with the highest primary key that have more than 24 likes.

You can for example filter the balance between likes and dislikes with:

from django.db.models import Count, Value
from django.db.models.functions import Coalesce

contents = Content.objects.annotate(
    nlikes=Coalesce(Count('likes', distinct=True), Value(0))-Coalesce(Count('dislikes', distinct=True), Value(0))
).filter(
    is_excluded=False,
    nlikes__gt=24
).order_by('-id')[:500]

Here the distinct=True [Django-doc] is necessary, since there are two JOINs, and without a uniquness filter, the two aggregates would thus act as multipliers of each other.

That being said, I'm not sure if it is a good idea to have two related models for likes and dislikes. Why not make one model (like Thumb), with a value +1 for a like, and -1 for a dislike. It makes the modeling simpeler, since then you can use a Sum expression [Django-doc] instead.

Upvotes: 4

Related Questions