Tony

Reputation: 10208

Solr configuration on Heroku

I am using WebSolr Cobalt on Heroku. Search works if I enter either the first letter of a word or the full word, but it does not match on partial words.

Any help?

Upvotes: 1

Views: 730

Answers (1)

Aaron Henderson

Reputation: 1880

To enable partial word searching

You must edit your local schema.xml file, usually under solr/conf, to add one of these filters:

  1. NGramFilterFactory
  2. EdgeNGramFilterFactory

Here's what mine looks like - sample schema.xml

EdgeNGram

I went with the EdgeNGram option. It doesn't allow searching in the middle of words, but it does allow partial word search starting from the beginning of the word. This cuts way down on false positives (matches you don't want), performs better, and is usually not missed by users. I also like minGramSize=2, which means you must enter at least 2 characters to get a match. Some folks set this to 3.
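To make the difference between the two filters concrete, here is a rough Ruby sketch of the kind of tokens each one produces for a single word. It is only an illustration, not Solr's implementation: the helper names `edge_ngrams` and `ngrams` are made up for this example, and the maximum gram size of 15 is an assumed value.

```ruby
# Illustration only: approximates the tokens each filter would index for one
# word. The method names and the max size of 15 are assumptions for this sketch.

# Edge n-grams: prefixes of the word, from min_size up to max_size characters.
def edge_ngrams(word, min_size, max_size)
  longest = [max_size, word.length].min
  (min_size..longest).map { |size| word[0, size] }
end

# Plain n-grams: every substring between min_size and max_size characters.
def ngrams(word, min_size, max_size)
  longest = [max_size, word.length].min
  (min_size..longest).flat_map do |size|
    (0..word.length - size).map { |start| word[start, size] }
  end
end

word = "heroku"
edge = edge_ngrams(word, 2, 15)
all  = ngrams(word, 2, 15)

puts edge.inspect          # ["he", "her", "hero", "herok", "heroku"]
puts edge.include?("ro")   # false: "ro" is in the middle of the word
puts all.include?("ro")    # true
puts "#{edge.size} edge n-grams vs #{all.size} n-grams"  # 5 vs 15
```

The plain n-gram list is three times longer for this one short word, which is roughly why it costs more index space and produces more unwanted matches.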

Once your local setup is working, you must also edit the schema.xml used by Websolr. Otherwise you will get the default behavior, which requires the full word to be entered even if you have full-text search configured for your models.

To edit the websolr schema.xml

  1. Go to the Heroku online dashboard for your app
  2. Go to the resources tab, then click on the Websolr add-on
  3. Click the default link under Indexes
  4. Click on the Advanced Configuration link
  5. Paste in your schema.xml from your local setup, including the config for your n-gram filter of choice (mentioned above). Save.
  6. Copy the link in the "Configure your Heroku application" box, then paste it into a terminal to set WEBSOLR_URL in your Heroku config.
  7. Click the Index Status link to get nifty stats and see if you are running fast or slow.
  8. Reindex everything

heroku run rake sunspot:reindex[5000]

  • Don't use heroku run rake sunspot:solr:reindex - it is deprecated, accepts no parameters and is WAY slower
  • The default batch size is 50. Most people suggest 1000, but I've seen significantly faster results (about 1000 rows per second, as opposed to around 500) by bumping it up to 5000+
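One plausible reason the batch size matters is that fewer, larger batches mean fewer trips to the index. The Ruby sketch below just does that arithmetic. The record count of 100,000 is hypothetical, `batch_count` is a made-up helper, and the idea that each batch costs roughly one round trip is an assumption, not a measured fact.

```ruby
# Number of batches needed to index `total` records at a given batch size.
# Assumption for this sketch: each batch is roughly one round trip to the index.
def batch_count(total, batch_size)
  (total + batch_size - 1) / batch_size  # integer ceiling division
end

total = 100_000  # hypothetical number of records to reindex

[50, 1000, 5000].each do |size|
  puts "batch size #{size}: #{batch_count(total, size)} batches"
end
# batch size 50: 2000 batches
# batch size 1000: 100 batches
# batch size 5000: 20 batches
```

Larger batches also use more memory per request, so it is worth timing a few sizes against your own data rather than assuming bigger is always faster.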

Take it to the next level

5 ways to speed up indexing

Upvotes: 2
