Martol1ni
Martol1ni

Reputation: 4702

query slower with prefetch related than without?

This is the pseudocode of my models

Offer
    match = FK(Match)

Odds
    offer = FK(Offer, related_name='odds')

test1 = "[o.odds.all() for o in Offer.objects.filter(match__id=123)]"
test2 = "[o.odds.all() for o in Offer.objects.filter(match__id=123).prefetch_related('odds')]"
timeit.timeit(test1, number=1000)
>>> 3.078
timeit.timeit(test2, number=1000)
>>> 11.794

I thought that when prefetching the odds, displaying them would be no problem at all - then it turns out that it's actually slower to prefetch them. Am I doing something wrong, or do I simply not understand the use of prefetch_related?

Thanks

EDIT

I am going to query for o.odds.latest() and take use of that information only, and I'm simply trying to optimize my query. I'm just having a hard time wrapping my head around select_related and prefetch_related.

Upvotes: 0

Views: 2405

Answers (2)

Craig Labenz
Craig Labenz

Reputation: 2555

If you examine the queries Django executes, you'll see that prefetch_related() leads to an additional query with a WHERE IN (ids) clause. Those values are then mapped back to the original object in Python, preventing additional database load.

Calling prefetch_related() on unrestrained all() QuerySets against your entire database can be very costly, especially if your tables are large. However, if you are working with smaller sets of data thanks to a meaningful filter(key=value) addition to your original QuerySet, prefetch_related() will reduce N trips to the database down to 1 trip to the database. This can lead to a drastic performance increase.

And as a side note -- don't run prefetch_related("odds") unless you need that data. Your test isn't making use of the extra information, but it's still doing the extra work to fetch it. There's no free lunch :)

Upvotes: 2

petkostas
petkostas

Reputation: 7460

Since you are using an FK you could use a select_related rather than prefetch_related and see how this compares:

select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.

prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey.

Upvotes: 3

Related Questions