Reputation: 928
We are storing a large number of tweets and blog feeds in Solr.
Now, if a user searches for a Twitter mention such as @rohit, records that merely contain the word rohit are also returned, even if we search for the exact phrase "@rohit". I understand this happens because WordDelimiterFilterFactory splits on special characters.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
How can I force Solr not to return records that lack the "@"? I don't want to remove WordDelimiterFilterFactory, since its splitOnCaseChange and stemEnglishPossessive options are helpful.
Upvotes: 0
Views: 271
Reputation: 1401
What I would do is create a new fieldType with preserveOriginal="1" set on its WordDelimiterFilterFactory, then add a copyField so the same content is indexed under both the old and the new fieldType. That way you end up with two versions of the field, both searchable, since sometimes you will want to search without the '@' as well. Then, if somebody searches with a special character such as '@', query the field that preserves the original; otherwise, query the default field as usual.
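A rough sketch of the two-field setup in schema.xml might look like the following. The field names `content` and `content_exact` and the type names `text` and `text_mention` are hypothetical placeholders, and `text_mention` is assumed to be a fieldType whose WordDelimiterFilterFactory has preserveOriginal="1".

```xml
<!-- Sketch only: field and type names are assumptions, not taken from the question. -->

<!-- Existing field, analyzed with the current fieldType (splits "@rohit" into "rohit"). -->
<field name="content" type="text" indexed="true" stored="true"/>

<!-- Second copy of the same text, analyzed with a fieldType that keeps "@rohit" intact. -->
<field name="content_exact" type="text_mention" indexed="true" stored="false"/>

<!-- Index every value of "content" into "content_exact" as well. -->
<copyField source="content" dest="content_exact"/>
```

The application would then send a query such as `content_exact:"@rohit"` when the input contains an '@', and query `content` otherwise.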
Upvotes: 1
Reputation: 2549
If you set preserveOriginal="1", this problem should be fixed. If it is not, your tokenizer might be stripping the @, so you will have to choose another one, such as solr.WhitespaceTokenizerFactory.
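As an illustration, a fieldType combining both suggestions could be sketched like this. The type name `text_mention` is hypothetical, and the filter options other than preserveOriginal are carried over from the question or assumed.

```xml
<!-- Sketch only: the name "text_mention" and the exact filter chain are assumptions. -->
<fieldType name="text_mention" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Splits on whitespace only, so "@rohit" reaches the filters as a single token. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- preserveOriginal="1" emits "@rohit" in addition to the split part "rohit". -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            splitOnCaseChange="1"
            stemEnglishPossessive="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because this one analyzer is applied at both index and query time, a query for "@rohit" may still be expanded to include the part "rohit". It is worth checking the output for both sides on the Analysis page of the Solr admin UI before relying on it.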
Upvotes: 2