Christian Safka
Christian Safka

Reputation: 317

Django Trigram Similarity query not as fast as raw query with same result

The objective is to have a fast search for similar values, in order of similarity.

This is the query in Django: Model.objects.annotate(similarity=TrigramSimilarity('field_name', query)).filter(similarity__gt=0.3).order_by('-similarity').values('field_name')[0:30]

The above is converted to this query: SELECT "table"."field_name" FROM "table" WHERE SIMILARITY("table"."field_name", query_value) > 0.3 ORDER BY SIMILARITY("table"."field_name", query_value) DESC LIMIT 30

That takes around 1.3 seconds, but if I run this in psql:

SELECT field_name FROM table WHERE field_name % 'query_value' ORDER BY similarity(field_name, 'query_value') DESC LIMIT 30

It's much faster and has the same correct output. How can I achieve this with Django, besides putting in the raw sql command?

Another question is how to add SELECT DISTINCT field_name to that. Currently when trying to add it I get ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list even though I only use the selected field_name in the ORDER BY simlarity function.

Thanks a lot!

Upvotes: 0

Views: 1016

Answers (1)

n1_
n1_

Reputation: 4397

Kinda old question but I was messing up with trigrams latgely and run into similar issue.

Very important thing is (as mentioned here - https://dba.stackexchange.com/questions/103821/best-index-for-similarity-function/103823#103823) indexes are not used when you use functions! You have to use operator - so %.

Upvotes: 2

Related Questions