fkhan1

Reputation: 1

Indexing Special Terms Using Solr

I'm currently using Solr 5.2 on a Drupal website. Solr is installed and works well, except when content (for example an article) contains punctuation such as '/' in the title.

For example, 16/11 (many of my article titles begin with the day/month). Solr fails to index these titles as written; it strips the punctuation instead.

I have followed the following article: http://www.prowaveconsulting.com/indexing-special-terms-using-solr/

but haven't had much luck getting it to work. I need Solr to index the punctuation, mainly /!;,'"

Upvotes: 0

Views: 53

Answers (1)

roland_katona

Reputation: 101

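  1. Use solr.WhitespaceTokenizerFactory instead of solr.StandardTokenizerFactory. It splits only on whitespace, so "For example 16/11" yields the tokens ["For", "example", "16/11"]. It does not lowercase; add solr.LowerCaseFilterFactory if you want ["for", "example", "16/11"].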

  2. You can use solr.PatternTokenizerFactory with a pattern. It breaks the text at the specified regular expression pattern.

     <fieldType name="semicolonDelimited" class="solr.TextField">
       <analyzer>
         <tokenizer class="solr.PatternTokenizerFactory" pattern=";\s*" />
       </analyzer>
     </fieldType>

  3. You can write your own custom tokenizer.
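
For the first option, a field type might look roughly like the sketch below. The name text_ws_punct is made up for illustration, and the lowercase filter is optional; adapt both to whatever schema.xml your Drupal Solr module ships.

```xml
<fieldType name="text_ws_punct" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so "16/11" stays a single token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Optional: lowercase tokens so matching is case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Keep in mind that with whitespace-only splitting, trailing punctuation stays attached to the word (a title containing "example," produces the token "example,"), and that existing content has to be re-indexed after the schema change before the new tokens show up.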

Upvotes: 1
