Reputation: 60556
I have a table containing 100m rows and i need to full text search it
and provide information about how similar (e.g. with the pg_trgm
module)
the text's are. Off cause the problem here is that it should be fast.
I tried gist and gin indexes, had a extra column with the tsvector of my field etc.
My idea is to query first using tsvector and after that running the similarity function provided by the pg_trgm
module.
My problem is the following. If i use a whole word as my query it will work. But not if i append something.
This makes total sense because the tsvector of "A quick brown fox..." is
"'a':1 'brown':3 'fox':4 'quick':2"
.
I hope i made clear what i would like to achieve.
select to_tsvector('A quick brown fox...') @@ to_tsquery('quick') -- true
select to_tsvector('A quick brown fox...') @@ to_tsquery('quicks') -- false
Any ideas on how to achieve that using postgresql?
Upvotes: 1
Views: 1736
Reputation: 171
You need to set the language configuration parameter, like this:
select to_tsvector('english', 'A quick brown fox...') @@ to_tsquery('english', 'quicks')
The capability to recognize lexemes correctly, with plurals and stuff only happens when tsquery
and tsvector
have the same language configuration.
Upvotes: 1