sesmic

Reputation: 928

Solr Search Issue

We are storing a large number of tweets and blog feeds in Solr.

Now, if a user searches for a Twitter mention such as @rohit, records that contain only the word rohit are returned as well, even when we search for the exact phrase "@rohit". I understand this happens because WordDelimiterFilterFactory splits tokens on special characters.

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

How can I force Solr not to return records that lack the "@"? I don't want to remove WordDelimiterFilterFactory, since its splitOnCaseChange and stemEnglishPossessive options are helpful.

Upvotes: 0

Views: 271

Answers (2)

harmstyler

Reputation: 1401

I would create a new fieldType whose WordDelimiterFilterFactory has preserveOriginal="1", and use a copyField so that the same content is indexed under both the old fieldType and the new one. You end up with two versions of the field that can both be searched, which is useful because sometimes you will want to search without the '@' as well. Then, if somebody searches with a special character such as '@', query the field that preserves the original; otherwise search the default field as normal.
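A minimal schema.xml sketch of the two-field setup, as a fragment to merge into an existing schema rather than a complete file. The names `text_wdf`, `text_wdf_orig`, `tweet_text`, and `tweet_text_exact` are hypothetical, and `text_wdf` stands in for whatever type the field uses today. The separate query analyzer is an assumption on my part: if the word delimiter filter also runs at query time, "@rohit" is likely to be split back into rohit and match the plain word again.

```xml
<!-- Hypothetical names: text_wdf (existing type), text_wdf_orig, tweet_text, tweet_text_exact -->

<!-- New type: same word-delimiter behaviour, but the unsplit token is indexed too -->
<fieldType name="text_wdf_orig" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Whitespace tokenizer, so "@" reaches the filter intact -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            splitOnCaseChange="1"
            stemEnglishPossessive="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- Assumption: no word-delimiter filter here, so "@rohit" stays one token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Default field keeps the old type; the copy is indexed with the new one -->
<field name="tweet_text" type="text_wdf" indexed="true" stored="true"/>
<field name="tweet_text_exact" type="text_wdf_orig" indexed="true" stored="false"/>
<copyField source="tweet_text" dest="tweet_text_exact"/>
```

With a layout like this, the application would send `tweet_text_exact:"@rohit"` when the query contains an '@' and `tweet_text:rohit` otherwise. Documents indexed before the schema change need to be reindexed for the new field to be populated.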

Upvotes: 1

Okke Klein

Reputation: 2549

Setting preserveOriginal="1" should fix this problem. If it does not, your tokenizer might be stripping the @, in which case you have to choose another one, such as solr.WhitespaceTokenizerFactory.
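As a rough illustration, the index-time analyzer of the existing fieldType might end up looking like the fragment below. The surrounding filter list is an assumption, since the question does not show the current definition. solr.StandardTokenizerFactory is one example of a tokenizer that discards "@" before any filter sees it.

```xml
<!-- Index-time chain only; adapt the filter list to the existing fieldType -->
<analyzer type="index">
  <!-- Splits on whitespace only, so "@rohit" arrives at the filter as one token -->
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- preserveOriginal="1" indexes "@rohit" alongside the split part "rohit" -->
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1"
          splitOnCaseChange="1"
          stemEnglishPossessive="1"
          preserveOriginal="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```

The Analysis page in the Solr admin UI shows the tokens produced on the index and query sides, which is a quick way to check whether the "@" survives. If the query-side chain also splits "@rohit" into rohit, plain rohit records may still match, so that side is worth checking too.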

Upvotes: 2
