Ziyan Junaideen
Ziyan Junaideen

Reputation: 3300

PostgreSQL - Text search with to_tsvector - Omitting some records

Background

I am having issues with a PostgreSQL text search with to_tsvector. It is mostly working exept it doesn't find records when given "Don" as the query string.

Details

After some tries I wrote a query (a Rails scope) to find users by their name. The database has 2 columns in the user table first_name and last_name. Since it is used in an autocomplete search I would like it to return results regardless if its part of the first name or last name ex:

User.with_name("ab") -> User(first_name: "Abc", last_name: "Def")
User.with_name("De") -> User(first_name: "Abc", last_name: "Def")

What I came up with is some thing like this:

  scope :with_name, lambda { |name|
    return scoped if name.blank?
    name  = name.to_s.strip
    value = name.gsub(' ', '\\ ').gsub('(', '\(').gsub(')', '\)')
    value << ":*" if name.present?
    where(
      "to_tsvector('english', first_name || ' ' || last_name) @@ to_tsquery('english', ?)",
      value
    )
  }

It has worked well for years for me, except last week there was a complaint that a specific user was not showing any results when searching "Don". I can't find any reason to as why "Don" isn't included. In fact there are 2 "Don"s.

I noticed that 3 letter names are showing up. Also if I do "Don Sh" with part of the last name the user shows up in results.

In case you would like t have the a look at the generated SQL:

SELECT "users".* FROM "users"  WHERE (to_tsvector('english', first_name || ' ' || last_name) @@ to_tsquery('english', 'Don:*'))

Help

  1. Is there a better way to implement a search for an autocompelte field?
  2. Why is this happening and can this be corrected?

Upvotes: 1

Views: 658

Answers (1)

jjanes
jjanes

Reputation: 44323

Don is a stop-word in English. I assume that is because it is what don't gets parsed down to; as don itself (the opposite of doff) is not all that common of a word in itself.

You could switch from 'english' to 'simple', as 'simple' doesn't implement stop words. This would require changing your program and rebuilding the index, of course.

Alternatively, you could also just remove don from SHAREDIR/tsearch_data/english.stop, and then rebuild the index.

Upvotes: 2

Related Questions