Matt Huggins

Reputation: 83269

Solr (sunspot) not finding partial word match when suffix included

I'm integrating Solr into a Rails app, specifically for ingredient searches. If I do a partial word match on a simple noun like "beef" or "chicken", I can type any number of letters, from one up to the full string, and it finds ingredients containing those words. The problem appears when a word has a suffix, such as "eggs" (-s), "baked" (-ed), or "baking" (-ing).

Let's take "baking" as an example. Searching for "b", "ba", or "bak" returns every result containing the word "baking". Searching for "baki", "bakin", or "baking" returns no results.

I'm wondering whether I'm doing something wrong in my Rails search code, or whether I need to edit something in the schema.xml file. My schema is the default provided by sunspot. My model and search code look like the following.

class Ingredient < ActiveRecord::Base
  validates :name, presence: true, uniqueness: true

  searchable do
    text :name
  end

  def self.search_by_partial_name(name)
    keywords = name.to_s.split(/\s+/).delete_if(&:blank?)

    search = Sunspot.search(self) do
      text_fields do
        keywords.each do |keyword|
          with(:name).starting_with(keyword)
        end
      end
    end

    search.results
  end
end

Searching:

Ingredient.search_by_partial_name('baki')  # => []
Ingredient.search_by_partial_name('bak')   # => [<Ingredient "baking powder">,
                                           #     <Ingredient "baking potato">,
                                           #     ...]

Thanks!

Edit: Here are the logs showing the Solr queries performed for the two examples above.

Started GET "/admin/ingredients/search?term=bak" for 127.0.0.1 at 2014-11-23 09:21:01 -0700
Processing by Admin::IngredientsController#search as JSON
  Parameters: {"term"=>"bak"}
  User Load (0.4ms)  SELECT  "users".* FROM "users"  WHERE "users"."id" = 1  ORDER BY "users"."id" ASC LIMIT 1
  SOLR Request (4.9ms)  [ path=select parameters={fq: ["type:Ingredient", "name_text:bak*"], start: 0, rows: 30, q: "*:*"} ]
  Ingredient Load (0.8ms)  SELECT "ingredients".* FROM "ingredients"  WHERE "ingredients"."id" IN (9853, 9858, 10099, 10281, 10289, 10295, 10350, 10498, 10507, 10583, 10733, 10787, 11048, 11148, 11395, 11603, 11634, 11676, 11734, 11863, 12031, 12189, 12268, 12399, 13128, 13577, 13830, 13886, 14272, 14366)
Completed 200 OK in 12ms (Views: 1.3ms | ActiveRecord: 1.1ms | Solr: 4.9ms)

Started GET "/admin/ingredients/search?term=baki" for 127.0.0.1 at 2014-11-23 09:21:22 -0700
Processing by Admin::IngredientsController#search as JSON
  Parameters: {"term"=>"baki"}
  User Load (0.4ms)  SELECT  "users".* FROM "users"  WHERE "users"."id" = 1  ORDER BY "users"."id" ASC LIMIT 1
  SOLR Request (4.5ms)  [ path=select parameters={fq: ["type:Ingredient", "name_text:baki*"], start: 0, rows: 30, q: "*:*"} ]
Completed 200 OK in 7ms (Views: 0.4ms | ActiveRecord: 0.4ms | Solr: 4.5ms)

Upvotes: 3

Views: 1657

Answers (2)

coorasse

Reputation: 5528

Add an asterisk at the end of the search query:

Ingredient.search_by_partial_name('baki*')

Upvotes: 2

Yann Yu

Reputation: 11

Can you post the logs or the actual Solr queries generated by the following two calls?

Ingredient.search_by_partial_name('baki')  # => []
Ingredient.search_by_partial_name('bak')   # => [<Ingredient "baking powder">, ...]

That would show exactly what is being sent to Solr, and therefore what Solr is trying to do.

Edit: Given that you want partial matches, I'm assuming this is meant to be an "auto-complete" type search rather than a standard full-text search. If so, you likely don't want to run it against a text/tokenized field, because that field is stemmed at index time and will not behave the way you want for partial words like "baki".
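To make the stemming effect concrete, here is a minimal Ruby sketch. It is a toy model, not Solr's analyzer: `TOY_STEMS` is a hypothetical hand-written table standing in for a Porter-style stemmer, and `indexed_terms` and `prefix_hit?` are made-up helper names. It assumes the field's analyzer stems words at index time while a wildcard query such as `name_text:baki*` is compared against the stored terms without being stemmed.

```ruby
# Toy model only: a hand-written stem table standing in for a Porter-style stemmer.
TOY_STEMS = {
  "baking" => "bake",
  "baked"  => "bake",
  "eggs"   => "egg"
}.freeze

# Index time: lowercase, split on whitespace, and stem each word.
def indexed_terms(name)
  name.downcase.split(/\s+/).map { |word| TOY_STEMS.fetch(word, word) }
end

# Query time: the prefix is compared with the stored terms as typed (no stemming).
def prefix_hit?(name, prefix)
  indexed_terms(name).any? { |term| term.start_with?(prefix.downcase) }
end

indexed_terms("baking powder")        # stored terms are "bake" and "powder"
prefix_hit?("baking powder", "bak")   # true: "bake" starts with "bak"
prefix_hit?("baking powder", "baki")  # false: no stored term starts with "baki"
```

Under these assumptions, "bak" still matches because it happens to be a prefix of the stem "bake", while "baki", "bakin", and "baking" are not prefixes of any stored term.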

One possible fix is to add a field of fieldType string that holds the ingredient names. Your search can then run a prefix (or wildcard) query against that field and bring back "baking powder" from "bak".

Note that prefix search works best on string fields and matches only from the beginning of the string, not from within it. There are ways to do more advanced auto-complete functionality than I've shown.
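The trade-off between the two field types can be sketched in plain Ruby. This is an illustration only, not Sunspot or Solr code: both helper names are hypothetical, and the lowercasing is an assumption (a string field stores its value verbatim, so case would have to be normalized at index or query time).

```ruby
# A string field keeps the whole value as one term, so a prefix must match
# from the very start of the name.
def string_field_prefix?(name, prefix)
  name.downcase.start_with?(prefix.downcase)
end

# A tokenized field (stemming left out here) is matched word by word.
def tokenized_prefix?(name, prefix)
  name.downcase.split(/\s+/).any? { |word| word.start_with?(prefix.downcase) }
end

string_field_prefix?("baking powder", "baki")  # true
string_field_prefix?("baking powder", "powd")  # false: not at the start of the value
tokenized_prefix?("baking powder", "powd")     # true
```

In this model the string field handles "baki" correctly but can no longer find "baking powder" from "powd", which is the limitation described above.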

Upvotes: 1
