Gary Chambers
Gary Chambers

Reputation: 25868

Problem writing a database query

I have two models, Location and Event, that are linked by ForeignKey on the Event model. The models break down as follows:

class Location(models.Model):
    city = models.CharField('city', max_length=25)
    slug = models.SlugField('slug')

class Event(models.Model):
    location = models.ForeignKey(Location)
    title = models.CharField('title', max_length=75)
    start_date = models.DateTimeField('start_date')
    end_date = models.DateTimeField('end_date')

Each location has multiple events that are ordered by start_date descending. The query I am trying to make retrieves the next upcoming event for each of the locations.

Ideally I would like to make this in a single query (I don't want to run a query for every location, as this would cause a lot of unnecessary hits to the database). I have tried using Django's ORM, and I've also tried using raw SQL, but I've hit a bit of a roadblock.

Any help would be greatly appreciated.

Update

I have come up with a potential solution, though I'm not convinced that it's the best method. It works, which should be enough, but I'm curious to see what the best method of doing this would be.

Anyway, the code I've written reads thus:

l = Location.objects.select_related()
qs = None

# Concatenate the related event querysets into a new queryset
for e in l:
    if qs is None:
        qs = e.event_set.all()
    else:
        qs = qs | e.event_set.all()

# Order a slice of the queryset by start_date ascending
qs = sorted(qs[:l.count()], key=lambda s: s.start_date)

Upvotes: 2

Views: 183

Answers (4)

S.Lott
S.Lott

Reputation: 392010

"Ideally I would like to make this in a single query (I don't want to run a query for every location, as this would cause a lot of unnecessary hits to the database)."

This is a false assumption.

1) Django's ORM uses a cache. It may not query the database as often as you think. Databases have a cache. The cost of a query may not be what you think, either.

2) You have select_related. The ORM can do the join for you. http://docs.djangoproject.com/en/dev/ref/models/querysets/#id4

Just write the simplest possible loop for fetching Locations and Events. In the very unlikely event that this is the slowest part of your application (and you can prove that it's slowest), then add select_related and see how much that improves things.

Until you can prove that this specific query is killing your application, move on and don't worry about "hits to the database".

Next event in each location?

[ l.event_set.order_by( start_date ).all()[0] for l in Location.objects.select_related().all() ]

Or perhaps

events = []
for l in Location.objects.select_related().all():
    events.append( l.event_set.order_by( start_date ).all()[0] )

And return that to the template to be rendered.

Don't dismiss this until you've benchmarked it and proven that it is the bottleneck in your application.

Upvotes: 1

Something like the following may be useful:

SELECT * FROM EVENTS V
  WHERE (V.LOCATION, V.START_DATE) IN
    (SELECT E.LOCATION, MIN(E.START_DATE)
       FROM EVENTS E
       WHERE E.START_DATE >= NOW
       GROUP BY E.LOCATION)

Share and enjoy.

Upvotes: 0

Pentium10
Pentium10

Reputation: 208022

select id, (
    select * from event 
    where location=location.id 
    and start_date>NOW() 
    order by start_date asc 
    limit 1
    )
 from location

Upvotes: 1

Alexey
Alexey

Reputation: 158

I think you should look at django aggregationlink text, so your result will be agragated by location with a condition to filter future/expired events and min(start_time) returns next event time

Upvotes: 0

Related Questions