Saturnix

Reputation: 10564

Adding only() before prefetch_related() slows down the query

In pseudocode, I have a db model like this:

Model T
    name
    tags = models.ManyToManyField(Tag, related_name="Ts")
    symbol = models.ForeignKey(Symbol)

Model Symbol
    name
    category = models.ForeignKey(Category)

Model Tag
    name

And this is the code I use to export it:

from django.db.models import F
from django.forms.models import model_to_dict

query = T.objects.annotate(category_id=F('symbol__category_id')).prefetch_related('tags')

for t in query:
    _dict = model_to_dict(t)
    _dict["category_id"] = t.category_id

    # Replace the Tag objects with their ids
    _tags = []
    for tag in _dict["tags"]:
        _tags.append(tag.id)
    _dict["tags"] = _tags

In this code, _dict gives me the wanted result.

However, T has many other fields I don't need, so I changed query to:

T.objects.only("name", "symbol", "tags").annotate(category=F('symbol__category_id')).prefetch_related('tags')

For some reason, this slows down execution.

The original query takes 6 seconds, while the one with .only() takes 8 seconds. Why?

How can I prefetch everything correctly so that I don't have to loop over tags and append their ids in a dictionary? How can I do this while also using .only()?

EDIT:

For some reason, using .defer() instead of .only(), and listing the fields I don't want, works without any performance hit.

What's the difference between defer and only, and why does one of them create a performance bottleneck?

Upvotes: 1

Views: 130

Answers (1)

Saturnix

Reputation: 10564

With fields = ("name", ...):

from django.db.models import F, Prefetch
from django.forms.models import model_to_dict

# Load only the id of each related Tag
prefetch = Prefetch('tags', queryset=Tag.objects.only('id'))
query = T.objects.prefetch_related(prefetch).annotate(category_id=F('symbol__category_id')).only(*fields)

for t in query:
    # Restrict model_to_dict to the listed fields plus the prefetched tags
    _dict = model_to_dict(t, fields=[*fields, "tags"])
    _dict["category_id"] = t.category_id
    _dict["tags"] = [tag.id for tag in _dict["tags"]]
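
One possible explanation for why passing fields= helps (I have not profiled the original code, so treat this as an assumption): when fields is omitted, model_to_dict reads every editable field, and reading a field that .only() deferred makes Django fetch it with an extra query for that instance. The toy below is plain Python, not Django. LazyRow, to_dict and the notes/description fields are made up for illustration and only mimic that lazy-loading behaviour.

```python
# Toy stand-in for a model instance with deferred fields. NOT Django code.
class LazyRow:
    queries = 0  # counts simulated follow-up queries

    def __init__(self, stored, loaded):
        self._stored = stored
        # Fields listed in `loaded` are available immediately (like .only()).
        for name in loaded:
            setattr(self, name, stored[name])

    def __getattr__(self, name):
        # Reached only when normal lookup fails, i.e. for a deferred field.
        stored = self.__dict__.get("_stored", {})
        if name not in stored:
            raise AttributeError(name)
        LazyRow.queries += 1           # one simulated query per deferred read
        value = stored[name]
        setattr(self, name, value)     # cache so later reads are free
        return value


def to_dict(row, all_fields, fields=None):
    # Hypothetical analogue of model_to_dict: reads every field
    # unless `fields` restricts it.
    wanted = all_fields if fields is None else fields
    return {name: getattr(row, name) for name in wanted}


ALL_FIELDS = ["name", "symbol", "notes", "description"]  # hypothetical fields
data = [
    {"name": f"t{i}", "symbol": i, "notes": "n", "description": "d"}
    for i in range(3)
]

# Case 1: only name/symbol loaded, but every field is read.
rows = [LazyRow(d, loaded=["name", "symbol"]) for d in data]
for row in rows:
    to_dict(row, ALL_FIELDS)
unrestricted = LazyRow.queries

# Case 2: same rows, but the read is restricted to the loaded fields.
LazyRow.queries = 0
rows = [LazyRow(d, loaded=["name", "symbol"]) for d in data]
for row in rows:
    to_dict(row, ALL_FIELDS, fields=["name", "symbol"])
restricted = LazyRow.queries

print(unrestricted, restricted)
```

In the toy, the unrestricted read pays one extra fetch per deferred field per row, while the restricted read pays none. Whether that accounts for the 6 vs 8 seconds in the question would need checking against the actual queries issued (for example with django.db.connection.queries while DEBUG is on).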

Upvotes: 1
