Reputation: 16022
We are evaluating a bunch of options at the moment for Fuzzy searches. One of our requirements is that mistyped words be matched. An OCR of CRA1G (with the number 1) must match 'CRAIG'.
Full text search in sql server seems to get close, but no cigar.
Are there some options besides 'FORMSOF(Inflectional, CRA1G)' that might be able to make substitutions for these types of mismatches and still get results?
I am unable to find anything similar SOUNDEX is definitely not good enough.
I might try Lucene next, which has good fuzzy search capability.
Regards
Craig.
Upvotes: 5
Views: 3271
Reputation: 1
I know it's an old thread, but many of us face the same problem till today. One good solution is searloc. It is a CLR library with zero dependencies and with many features. It supports full text search, phonetic match for all languages, keyboard match, fuzzy search, and many others. And the most important it is very fast, it needs just a few milliseconds to search million of records.
Upvotes: 0
Reputation: 8726
SQL Server 2008 supports thesauruses should get you to where you need. You would define your misspellings as synonyms. See http://msdn.microsoft.com/en-us/library/ms142491.aspx.
Edit: To my understanding, you couldn't do this using a wildcard or regular expression approach. You'd have to define common replacement set patterns for your words that you anticipate will be most commonly mis-typed.
Upvotes: 3
Reputation: 5666
This might not help you, but we've had this problem for names. In german, there are alot of similar sounding names: Bayer, Beyer, Baier would all sound the same. For this, we've created a seperate column using phonetic spelling. You might create a similar alphabet for your fuzziness, where "I" equals "1", and get similar results.
Upvotes: 2