ppyhd
ppyhd

Reputation: 35

Filter latest record in Django

Writing my first Django app that gets messages from other applications and stores reports about them.

It is performing very slow due to the following logic that I hope can be improved but I'm struggling to find a way to do it with out a loop.

Basically I'm just trying to go through all of the apps (there are about 500 unique ones) and get the latest report for each one. Here are my models and function:

class App(models.Model):
    app_name = models.CharField(max_length=200)
    host = models.CharField(max_length=50)

class Report(models.Model):
    app = models.ForeignKey(App)
    date = models.DateTimeField(auto_now_add=True)
    status = models.CharField(max_length=20)
    runtime = models.DecimalField(max_digits=13, decimal_places=2,blank=True,null=True)
    end_time = models.DateTimeField(blank=True,null=True)



def get_latest_report():
    """ Returns the latest report from each app """
    lset = set()
    ## get distinct app values
    for r in Report.objects.order_by().values_list('app_id').distinct():
        ## get latest report (by date) and push in to stack.
        lreport = Report.objects.filter(app_id=r).latest('date')
        lset.add(lreport.pk)
    ## Filter objects and return the latest runs
    return Report.objects.filter(pk__in = lset)

Upvotes: 2

Views: 306

Answers (2)

Sebastian Wozny
Sebastian Wozny

Reputation: 17506

If you're not afraid of executing a query for every app in your database you can try it this way:

def get_latest_report():
    """ Returns the latest report from each app """
    return [app.report_set.latest('date') for app in App.objects.all()]

This adds a query for every app in your database, but is really expressive and sometimes maintainability and readability are more important than performance.

Upvotes: 3

Iain Shelvington
Iain Shelvington

Reputation: 32244

If you are using PostgreSQL you can use distinct and order_by in combination, giving you the latest report for each app like so

Report.objects.order_by('-date').distinct('app')

If you are using a database that does not support the DISTINCT ON clause, MySQL for example, and you do not mind changing the default ordering of the Report model, you can use prefetch_related to reduce 500+ queries to 2 (however this method will use a lot more memory as it will load every report)

class Report(models.Model):

    # Fields

    class Meta:
        ordering = ['-date']

def get_latest_report():
    latest_reports = []
    for app in App.objects.all().prefetch_related('report_set'):
        try:
            latest_reports.append(app.report_set.all()[0])
        except IndexError:
            pass
    return latest_reports

Upvotes: 3

Related Questions