Ashley Liu
Ashley Liu

Reputation: 463

Performing full text search on more than one field from model (Django 2.1)

I want to perform full text search on two fields from my model. Here is my current code:

if 'keyword' in request.GET:
    search_term = request.GET['keyword']
    vector = SearchVector('Title', weight='A') + SearchVector('Content', weight='B')
    articles = articles.annotate(similarity=TrigramSimilarity(vector, search_term),).filter(similarity__gt=0.01).order_by('-similarity')

This code returns the error message

function similarity(tsvector, unknown) does not exist

I assume this is because I am not combining the fields correctly because when I just put one single field in place of vector, it works fine. If the pg_trgm extension wasn't installed properly, trigram search wouldn't work on one field, right? What's the proper way of searching in more than one field?

Upvotes: 1

Views: 1093

Answers (2)

dirkgroten
dirkgroten

Reputation: 20702

I misread your question, the error shows you have similarity function installed, but says you can't use a search vector with the similarity function. Since you want to search for patterns with a similarity higher than a threshold, you can calculate the similarity on each field separately and return the maximum value. You can't combine them with a rank. Try this:

from django.db.models.functions import Greatest

articles.annotate(
    similarity=Greatest(
         TrigramSimilarity('Title', search_term), 
         TrigramSimilarity('Content', search_term)
    )).filter(similarity__gte=0.1).order_by('-similarity')

If you want to give a higher weight to 'Title', you can just wrap with a math function to add weight instead of using Greatest:

A = 1.0; B = 0.4
articles.annotate(
    similarity=(A/(A+B) * TrigramSimilarity('Title', search_term) 
               + B/(A+B) * TrigramSimilarity('Content', search_term))
    ).filter(similarity__gte=0.1).order_by('-similarity')

Note that the latter will decrease the similarity value if there's a good match in 'Title' and no match in 'Content' with respect to Greatest, so you probably want to set the threshold lower.

Note2: similarity looks at the full strings, so if you have a long piece of text ('Content') and just one keyword to search, similarity is going to return 0 even if the keyword is contained in the 'Content'. For full-text search, you're better off using SearchRank together with SearchVector.

Upvotes: 5

schillingt
schillingt

Reputation: 13731

I'm believe need to install the pg_trgm extension in your database. The docs indicate that it's necessary to use those functions.

Create an empty migration and use the following operation:

from django.contrib.postgres.operations import TrigramExtension

operations = [TrigramExtension()]

Upvotes: 0

Related Questions