Reputation: 154
I'm writing a search feature for a database of NFL players.
The user enters a search string like "Jason Campbell" or "Campbell" or "Jason".
I'm having trouble getting the appropriate results.
Which Analyzer should I use when indexing? Which Query when querying? Should I distinguish between first name and last name or just index the full name string?
I'd like the following behavior:
Query: "Jason Campbell" -> Result: exact match for 1 player, Jason Campbell
Query: "Campbell" -> Result: all players with Campbell in their name
Query: "Jason" -> Result: all players with Jason in their name
Query: "Cambel" [misspelled] -> Result: all players with Campbell in their name
Upvotes: 3
Views: 3855
Reputation: 8553
StandardAnalyzer should work fine for all of the above queries. Enclose your first query in double quotes for an exact (phrase) match; your last query requires a fuzzy query. For example, you could search for Cambell~0.5 and get Campbell as a match (the numeric value after the tilde sets the fuzziness).
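To make the fuzzy-matching idea concrete, here is a minimal sketch in plain Java of what a fuzzy term match roughly does: lowercase and split each name into tokens, then accept a name if any token is within a given edit distance of the query term. This is not Lucene's API or its actual implementation; NameSearchSketch, tokens, and fuzzySearch are hypothetical names, and the edit-distance threshold stands in for the fuzziness value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only: plain Java, no Lucene classes involved.
public class NameSearchSketch {

    // Levenshtein distance: minimum number of single-character
    // insertions, deletions, or substitutions turning a into b.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Lowercase and split on whitespace: a rough stand-in for analysis.
    public static String[] tokens(String s) {
        return s.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }

    // Names with at least one token within maxEdits of the query term.
    // maxEdits = 0 behaves like a plain (case-insensitive) term match.
    public static List<String> fuzzySearch(List<String> names, String term, int maxEdits) {
        String q = term.trim().toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            for (String tok : tokens(name)) {
                if (editDistance(q, tok) <= maxEdits) {
                    hits.add(name);
                    break; // one matching token is enough
                }
            }
        }
        return hits;
    }
}
```

With a made-up sample list such as "Jason Campbell", "Mike Campbell", "Jason Smith", calling fuzzySearch(names, "Cambel", 2) would return both Campbell entries, because "cambel" is two insertions away from "campbell", while fuzzySearch(names, "Jason", 0) would return only the two Jasons.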
BTW, I would suggest using Solr, which provides spell-check and auto-suggest features, so you wouldn't have to reinvent the wheel. The spell-check is similar to Google's "did you mean..."
Upvotes: 4