43Tesseracts
43Tesseracts

Reputation: 4937

Optimizing a slow django queryset from a large and growing dataset

I have a page that loads way too slow. Somehow I need to improve the way it's data is queried (caching? partial load/pages? etc.)

Note I am a django noob and haven't quite wrapped my head around model.Manager and and models.query.QuerySet so if this setup looks awkward....

Currently, the page is taking about 18 seconds to load the queryset, and it currently only has about 500 records. There will be on average about 100 new records a day.

Network stats

The database is Postgresql

The slow view:

def approvals(request):
    ...
    approved_submissions = QuestSubmission.objects.all_approved()
    ...

The slow queryset:

class QuestSubmissionQuerySet(models.query.QuerySet):
    ...

    def approved(self):
        return self.filter(is_approved=True)

    def completed(self):
         return self.filter(is_completed=True).order_by('-time_completed')

    ...

class QuestSubmissionManager(models.Manager):
    def get_queryset(self):
        return QuestSubmissionQuerySet(self.model, using=self._db)

    def all_approved(self, user=None):
        return self.get_queryset().approved().completed()

    ...

Resulting SQL from QuestSubmission.objects.all_approved():

'SELECT "quest_manager_questsubmission"."id", "quest_manager_questsubmission"."quest_id", "quest_manager_questsubmission"."user_id", "quest_manager_questsubmission"."ordinal", "quest_manager_questsubmission"."is_completed", "quest_manager_questsubmission"."time_completed", "quest_manager_questsubmission"."is_approved", "quest_manager_questsubmission"."time_approved", "quest_manager_questsubmission"."timestamp", "quest_manager_questsubmission"."updated", "quest_manager_questsubmission"."game_lab_transfer" FROM "quest_manager_questsubmission" WHERE ("quest_manager_questsubmission"."is_approved" = True AND "quest_manager_questsubmission"."is_completed" = True) ORDER BY "quest_manager_questsubmission"."time_completed" DESC'

The slow model:

class QuestSubmission(models.Model):
    quest = models.ForeignKey(Quest)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, related_name="quest_submission_user")
    ordinal = models.PositiveIntegerField(default = 1, help_text = 'indicating submissions beyond the first for repeatable quests')
    is_completed = models.BooleanField(default=False)
    time_completed = models.DateTimeField(null=True, blank=True)
    is_approved = models.BooleanField(default=False)
    time_approved = models.DateTimeField(null=True, blank=True)
    timestamp = models.DateTimeField(auto_now=True, auto_now_add=False)
    updated = models.DateTimeField(auto_now=False, auto_now_add=True)
    game_lab_transfer = models.BooleanField(default = False, help_text = 'XP not counted')

    class Meta:
        ordering = ["time_approved", "time_completed"]

    objects = QuestSubmissionManager()

    #other methods
    ....

What strategies are available to fix this? I tried using django's Paginator, but it only seems to display in pages, but it still loads the entire queryset.

Upvotes: 6

Views: 9248

Answers (4)

raratiru
raratiru

Reputation: 9616

I would construct a base queryset to build upon and apply the filters needed:

def approvals(request):
    ...
    approved_submissions = QuestSubmission.objects.select_related('quest', 'user').all_approved()
    ...

Upvotes: 4

nima
nima

Reputation: 6733

This is more of a browser issue rather than a database issue. With Django pagination you can select one page at a time and lower the load on database as well as the browser: Django Pagination

Upvotes: 1

Alex Gaynor
Alex Gaynor

Reputation: 15009

The first thing to look at:

  • Is this query slow because it is returning a very large result set?

or

  • Is this query slow because it's taking a while to filter down the table?

Assuming the former, you don't have a lot of good options besides "return less data".

If it's the latter, you should probably run an EXPLAIN on the database, but right off the bat I'd say you probably want an index, possibly on (is_approved, is_completed). Which can be done with:

class Meta:
    index_together = [
        ["is_completed", "is_approved"],
    ]

Upvotes: 9

Paulo Pessoa
Paulo Pessoa

Reputation: 2569

If you are showing the related objects in your page try to use the select_related()

Without select_related(), this would make a database query for each loop iteration in order to fetch the related blog for each entry.

Upvotes: 4

Related Questions