Reputation: 2964
I have a MySQL table with roughly 1 million records. I will soon need to add search to my Rails 3.x app, and I want the search to be fuzzy.
I currently use a plugin (rails-fuzzy-search) for another table, but that one has only 3,000 records. The plugin stores trigrams in a separate table (25,000 trigrams for the 3,000-record table).
I can't use this method for my 1-million-record table, or the trigrams table could end up with maybe 100 million records!
I have seen some gems, as well as the option of using Sphinx and Thinking Sphinx plus add-ons.
I don't know which solution offers the best performance.
The search will cover two fields of my table.
Upvotes: 1
Views: 1419
Reputation: 563
Depending on your database, you may be able to enable database-specific features to achieve some "fuzzy matching" (whatever that means exactly).
MySQL has "Natural Language Mode", which is explained here.
This is fairly easy to implement. First add an index in a migration like so:
add_index :products, [:name, :description], type: :fulltext, name: :fulltext_index_products_on_name_and_description
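For context, a complete migration might look like the sketch below (the products table name is taken from the answer's example). Note that the type: :fulltext option on add_index arrived in later Rails versions, so on Rails 3.x you may need raw SQL via execute instead; also, MySQL only supports FULLTEXT indexes on MyISAM tables, or on InnoDB from MySQL 5.6 onward:

class AddFulltextIndexToProducts < ActiveRecord::Migration
  def up
    # Assumes a MySQL table whose storage engine supports FULLTEXT.
    add_index :products, [:name, :description],
              type: :fulltext,
              name: :fulltext_index_products_on_name_and_description
  end

  def down
    remove_index :products, name: :fulltext_index_products_on_name_and_description
  end
end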
Then you can use the MATCH clause in your query like so:
scope :search, ->(query) {
  where("MATCH (products.name, products.description) AGAINST (? IN NATURAL LANGUAGE MODE)", query)
  # MySQL should sort by relevance, so make sure not to override this with your own order clause.
}
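A hypothetical call site (model and column names assumed) would then be:

Product.search("vintage road bike").limit(10)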
Be aware that you may get unexpected results when testing on a small dataset; it's best to try it against the full data. You may also find "BOOLEAN MODE" worth looking into.
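As a rough sketch of that variant, reusing the same assumed products columns (in BOOLEAN MODE, operators such as + and - require or exclude terms, and results are not relevance-sorted by default):

scope :boolean_search, ->(query) {
  where("MATCH (products.name, products.description) AGAINST (? IN BOOLEAN MODE)", query)
}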
Upvotes: 0
Reputation: 41884
Some searching around revealed the fuzzily gem:
Anecdotal benchmark: against our whole Geonames-derived table of locations (3.2M records, about 1GB of data), on my development machine (a 2011 MacBook Pro):
- searching for the top 10 matching records takes 6ms ±1
- preparing the index for all records takes about 10min
- the DB query overhead when changing a record is 3ms ±2
- the memory overhead (footprint of the trigrams table index) is about 300MB
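For reference, wiring up fuzzily is roughly as follows, per the gem's README (the Stuff model and name column here are placeholders):

# Migration creating the trigrams table (helper provided by the gem):
class AddTrigramsModel < ActiveRecord::Migration
  extend Fuzzily::Migration
end

class Stuff < ActiveRecord::Base
  fuzzily_searchable :name
end

# Index existing rows once, then search:
Stuff.bulk_update_fuzzy_name
Stuff.find_by_fuzzy_name('Some name', limit: 10)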
Also, check out Solr and Sunspot.
Upvotes: 2