Reputation: 1
I'm currently using SOLR 5.2 on a Drupal website. SOLR is installed and works well. However, some of my content, for example an article, contains punctuation such as '/' in the title,
for example 16/11 (many of my articles begin with the day/month). SOLR fails to index these items. Instead, it strips the punctuation.
I followed this article: http://www.prowaveconsulting.com/indexing-special-terms-using-solr/
but haven't had much luck getting it to work. I need SOLR to index the punctuation, mainly /!;,'"
Upvotes: 0
Views: 53
Reputation: 101
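Use solr.WhitespaceTokenizerFactory instead of solr.StandardTokenizerFactory. It splits only on whitespace and leaves punctuation inside tokens untouched, so "For example 16/11" becomes the tokens ["For", "example", "16/11"]. The tokenizer itself does not lowercase; add a lowercase filter if you want ["for", "example", "16/11"].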
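As a rough sketch of what that could look like in schema.xml, assuming you define a new field type (the name "text_ws_punct" here is made up for illustration, and your Drupal-generated schema may use different field type names):

```xml
<!-- Hypothetical field type name; pick whatever fits your schema. -->
<fieldType name="text_ws_punct" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so "16/11" stays a single token. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Optional: lowercase tokens so searches are case-insensitive. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

You would then point the title field at this type and reindex, since documents that are already indexed keep the tokens produced by the old analyzer.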
Alternatively, you can use solr.PatternTokenizerFactory, which splits the text wherever the given regular expression matches. For example, this field type splits on semicolons followed by optional whitespace:
<fieldType name="semicolonDelimited" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern=";\s*" />
  </analyzer>
</fieldType>
Upvotes: 1