Reputation: 4525
I have the following class:
class Instance(models.Model):
products = models.ManyToManyField(Product, blank=True)
class Product(models.Model):
description = HTMLField(blank=True, null=True)
short_description = HTMLField(blank=True, null=True)
And this form that I use to update Instances
class InstanceModelForm(InstanceValidatorMixin, UpdateInstanceLastUpdatedMixin, forms.ModelForm):
class Meta:
model = Instance
products = forms.ModelMultipleChoiceField(required=False, queryset=Product.objects.annotate(i_count=Count('instance')).order_by('i_count'))
My instance-product table is sizable (~ 1000 rows) and ever since I've added the queryset for products I am seeing web requests that are timing out due heroku's 30 second request limit.
I have the following insights:
Accuracy doesn't matter as much to me - It doesn't have to be very accurate. Yes I would like to sort products by the count of instances this product has linked to but if it's off by 5 or 10 it doesn't really matter that much.
Limited number of products - When my users are selecting products to be linked to an instance, they are primarily interested in products with less than 10 total linkages to instances. I don't know if a partial query will be accurate, but if this is possible I am open to trying.
Effort - I know there are frameworks out there that I can install to cache many things. I am looking for something that is light weight and requires less than 1 hr to get up and running.
Upvotes: 1
Views: 52
Reputation: 29977
First I would want to ensure that the performance issue actually comes from the query. I've tried to reproduce your problem:
>>> Instance.objects.count()
102499
>>> Product.objects.count()
1000
>>> sum(p.instance_set.count() for p in Product.objects.all())/Product.objects.count()
273.084
>>> list(Product.objects.annotate(i_count=Count('instance')).order_by('i_count'))
[...]
>>> from django.db import connection
>>> connection.queries[-1]
{'sql': 'SELECT "products_product"."id", "products_product"."description", "products_product"."short_description", COUNT("products_instance_products"."instance_id") AS "i_count" FROM "products_product" LEFT OUTER JOIN "products_instance_products" ON ("products_product"."id" = "products_instance_products"."product_id") GROUP BY "products_product"."id", "products_product"."description", "products_product"."short_description" ORDER BY "i_count" ASC', 'time': '0.189'}
By accident, I created a dataset that is probably quite a bit bigger than yours. As you can see, I have 1000 Products with an average of ~273 related Instances, but the query still takes less than a second (both on SQLite and PostgreSQL).
Use a one-off dyno with heroku run bash
and check if you get the same numbers.
My guess is that your performance issues are either caused by
Product
, e.g. in your Product.__str__
method.MultipleChoiceField
field. By default, it will render as a <select>
with an <option>
for each Product
. This can be quite slow, and even it wasn't, it would pretty inconvenient to use. You might want to use a different widget, like django-select2
.Upvotes: 1