Vishal
Vishal

Reputation: 2336

Django Models - SELECT DISTINCT(foo) FROM table is too slow

I have a MySQL table with 13M rows. I can query the db directly as

SELECT DISTINCT(refdate) FROM myTable

The query takes 0.15 seconds and is great.

The equivalent table defined as a Django model and queried as

myTable.objects.values(`refdate`).distinct()

takes a very long time. Is it because there are too many items in the list before distinct(). How do I do this in a manner that doesn't bring everything down?

Upvotes: 0

Views: 1739

Answers (2)

Vishal
Vishal

Reputation: 2336

Thank you @solarissmoke for the pointer to connection.queries.

I was expecting to see

SELECT DISTINCT refdate FROM myTable

Instead, I got

SELECT DISTINCT refdate, itemIndex, itemType FROM myTable ORDER BY itemIndex, refdate, itemType. 

I then looked at myTable defined in models.py.

unique_together = (('nodeIndex', 'refdate', 'nodeType'), )
ordering = ['nodeIndex', 'refdate', 'nodeType']

From Interaction with default ordering or order_by

normally you won’t want extra columns playing a part in the result, so clear out the ordering, or at least make sure it’s restricted only to those fields you also select in a values() call.

So I tried order_by() to flush the previously defined ordering and voila!

myTable.objects.values('refdate').order_by().distinct()

Upvotes: 1

Nguyễn Hải Triều
Nguyễn Hải Triều

Reputation: 1464

You can try this:

myTable.objects.all().distinct('refdate')

Upvotes: 1

Related Questions