Reputation: 18104
When dealing with exact matches I'm given a real world query like this:
"not in education, employment, or training"
Converted to a Lucene query with stopwords removed gives:
+Content:"? ? education employment ? training"
Here's a more contrived example:
"there is no such thing"
Converted to a Lucene query with stopwords removed gives:
+Content:"? ? ? ? thing"
My goal is to have searches like these match only the exact match as the user entered it.
Could one solution be to clear the stopwords list? would this have adverse affects? if so what? (my google-fu failed)
Upvotes: 0
Views: 665
Reputation: 811
This all depends on the analyzer you are using. The StandardAnalyzer
uses Stop words and strips them out, in fact the StopAnalyzer
is where the StandardAnalyzer
gets its stop words from.
Use the WhitespaceAnalyzer
or create your own by inheriting from one that most closely suits your needs and modify it to be what you want.
Alternatively, if you like the StandardAnalyzer
you can new one up with a custom stop word list:
//This is what the default stop word list is in case you want to use or filter this
var defaultStopWords = StopAnalyzer.ENGLISH_STOP_WORDS_SET;
//create a new StandardAnalyzer with custom stop words
var sa = new StandardAnalyzer(
Version.LUCENE_29, //depends on your version
new HashSet<string> //pass in your own stop word list
{
"hello",
"world"
});
Upvotes: 1