Thuwarakesh Murallie
Thuwarakesh Murallie

Reputation: 103

Poor Performance when trigram similarity and full-text-search were combined with Q ind django using postgres

I'm creating a web application to search people with their properties such as education, experience, etc. I can't use full-text-search for all the fields, because, some has to be a fuzzy match. (Eg: if we search for biotech, it should pick bio tech, biotech and also bio-tech). My database has about 200 entries in the profile model, which is to appear in the search results.

Other models like education and experience are connected to profile through foreign key

Therefore, I decided to be selective on what method to use on what field. For shorter fields like degree name (In the Education model) I want to use trigram similarity. For fields like education description, I use Full-text search.

However, Since I have to do this in multiple fields, I used simple lookups instead of using search vectors.

Profile.objects.filter(
    Q(first_name__trigram_similar=search_term) |
    Q(last_name__trigram_similar=search_term) |
    Q(vision_expertise__search=search_term) |
    Q(educations__degree__trigram_similar=search_term) |
    Q(educations__field_of_study__trigram_similar=search_term) |
    Q(educations__school__trigram_similar=search_term) |
    Q(educations__description__search=search_term) |
    Q(experiences__title__trigram_similar=search_term) |
    Q(experiences__company__trigram_similar=search_term) |
    Q(experiences__description__search=search_term) |
    Q(publications__title__trigram_similar=search_term) |
    Q(publications__description__search=search_term) |
    Q(certification__certification_name__trigram_similar=search_term) |
    Q(certification__certification_authority__trigram_similar=search_term) |
    Q(bio_description__search=search_term) |
)

I get the expected results on every search. However, the time it takes to get it is ridiculously slow. I can't figure it out how to make this faster.

Upvotes: 2

Views: 3042

Answers (2)

Nick
Nick

Reputation: 51

Add Trigram indexes in postgres to improve performance, ie for users.first_name and users.last_name indexing:

CREATE INDEX index_users_full_name
             ON users using gin ((first_name || ' ' || last_name) gin_trgm_ops);

Upvotes: 0

Paolo Melchiorre
Paolo Melchiorre

Reputation: 6112

Without the class code it's difficult to find the better way to optimize your query.

You can add a Gin or Gist index to speed up the trigram similarity.

You can build an annotation with the SearchVector as below:

from django.contrib.postgres.aggregates import StringAgg
from django.contrib.postgres.search import SearchQuery, SearchVector

search_vectors = (
    SearchVector('vision_expertise') +
    SearchVector('bio_description') +
    SearchVector(StringAgg('experiences__description', delimiter=' ')) +
    SearchVector(StringAgg('educations__description', delimiter=' ')) +
    SearchVector(StringAgg('publications__description', delimiter=' '))
)

Profile.objects.annotate(
    search=search_vectors
).filter(
    Q(search=SearchQuery(search_term)) |
    Q(first_name__trigram_similar=search_term) |
    Q(last_name__trigram_similar=search_term) |
    Q(educations__degree__trigram_similar=search_term) |
    Q(educations__field_of_study__trigram_similar=search_term) |
    Q(educations__school__trigram_similar=search_term) |
    Q(experiences__title__trigram_similar=search_term) |
    Q(experiences__company__trigram_similar=search_term) |
    Q(publications__title__trigram_similar=search_term) |
    Q(certification__certification_name__trigram_similar=search_term) |
    Q(certification__certification_authority__trigram_similar=search_term)
)

You can speed-up the full-text search using a SearchVectorField

To find out about full-text search and trigram you can read the article I wrote on the subject:

"Full-Text Search in Django with PostgreSQL"

Upvotes: 3

Related Questions