Nelson
Nelson

Reputation: 43

Django: Calculate difference between fields on rows from same model

I have a model which represents a duration from the start of a run in a sport:

class Time(models.Model):
    # A participator of a race
    bib = models.ForeignKey(Bib, on_delete=models.CASCADE)
    # A timing point. Could be one at 0km, 5km and at the goal
    timing_point = models.ForeignKey(TimingPoint, on_delete=models.CASCADE)
    # The duration from start
    time = models.FloatField()

I want to calculate the duration between two timing points for each bib. The result should be a QuerySet with one row for each bib. I assume it would be most efficient if the data fetching and the duration calculation could be accomplished in one database query. Can this be done and how?

To clarify the goal, this is how I do it now (exception handling removed):

    times = []
    for bib in bibs:       
        from_time = Time.objects.get(timing_point=from_point, bib=bib)        
        to_time = Time.objects.get(timing_point=to_point, bib=bib)
        times.append({'bib': bib, 'time': to_time.time - from_time.time, })        

I want to avoid hitting the database this many times and I would prefer if the result could be a QuerySet since I want to be able do additional stuff like ordering.

Upvotes: 3

Views: 1435

Answers (1)

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 476503

In and higher, we can use .annotate(..) with an aggregate like Min, and a filter=... condition on that aggregate, like:

from django.db.models import F, Min, Q

t2=Min('time__time', filter=Q(time__timing_point=to_point))
t1=Min('time__time', filter=Q(time__timing_point=from_point))

Bib.objects.annotate(dtime=t2-t1)

We first introduce two variables:

  1. t2 will contain the smallest time for a related Time object with timing_point the to_point (likely there is only one); and
  2. t1 will contain the smallest time for a related Time object with timing_point the from_point.

We then make an annotation dtime with the difference between t2 and t1.

Since this is still part of the QuerySet, we can even order the Bibs on the dtime, etc.

Django will convert this to a query that looks like:

SELECT bib.*,
       (MIN(CASE WHEN time.timing_point_id = 2 THEN time.time ELSE NULL END) -
        MIN(CASE WHEN time.timing_point_id = 1 THEN time.time ELSE NULL END)) AS dtime
FROM bib
LEFT OUTER JOIN time ON bib.id = time.bib_id GROUP BY bib.id

With 2 and 1 in reality the primary keys of to_point and from_point respectively.

It might improve the efficiency of the query a bit further if you filter on the Timing model as well:

from django.db.models import F, Min, Q

t2=Min('time__time', filter=Q(time__timing_point=to_point))
t1=Min('time__time', filter=Q(time__timing_point=from_point))

Bib.objects.filter(
    time__timing_point__in=[from_point, to_point]
).annotate(dtime=t2-t1)

this will result in a query that looks like:

SELECT bib.*,
       (MIN(CASE WHEN time.timing_point_id = 2 THEN time.time ELSE NULL END) -
        MIN(CASE WHEN time.timing_point_id = 2 THEN time.time ELSE NULL END)) AS dtime
FROM bib LEFT OUTER JOIN time ON bib.id = time.bib_id
WHERE time.timing_point_id IN (1, 2)
GROUP BY bib.id

Upvotes: 5

Related Questions