Reputation: 1900
Suppose I have two related models
class Foo(models.Model):
    value = models.FloatField()

class Bar(models.Model):
    multiplier = models.FloatField()
    foo = models.ForeignKey(Foo, related_name="bars", on_delete=models.CASCADE)

    def multiply(self):
        return self.foo.value * self.multiplier
An instance of Foo will frequently have many instances of Bar, but some of the information Bar needs for its calculation is stored on Foo (because it is the same for all of the related Bars).
The problem is when I do something like this:
foo = Foo.objects.latest()
[x.multiply() for x in foo.bars.all()]
It ends up hitting the database a lot, because each Bar object in foo.bars.all() queries the database again for its Foo. So, if I have 10 Bars, I incur 11 database queries (1 to get the queryset with 10 Bars, and 1 per Bar reaching back for self.foo.value). Using select_related() doesn't seem to help.
My questions are: 1) Am I correct in thinking that memcached (e.g. Johnny Cache, Cache Machine) will solve this problem? 2) Is there a way of designing the object relationship that can make the command more efficient without a cache?
Upvotes: 1
Views: 559
Reputation: 70602
It is precisely this kind of situation that select_related and prefetch_related were created for. When you query with them, Django's ORM uses one of two techniques to avoid redundant database requests: following foreign-key relations via SQL JOINs (select_related), or pre-caching one-to-many / many-to-many relations in their own QuerySets (prefetch_related).
# Two queries: one for the Foo, one prefetching all of its bars
foo = Foo.objects.prefetch_related('bars').latest()
# No further queries: the prefetched bars already have their foo cached
[x.multiply() for x in foo.bars.all()]
Upvotes: 3