through.a.haze

Reputation: 526

How to make Solr recognize synonyms with any digits before them? e.g. # Molar as #M (and vice versa)

I need Solr to be able to recognize # Molar as #M (and vice versa) when searching as well as # Normal as #N (and vice versa).

I have many documents with 6 Molar, 1 Molar, or 0.5 Molar in the name, but these are often written as 6M, 1M, or 0.5M instead. The number can have more than one digit (such as 12M) or be a decimal (such as 0.1M).

I can't figure out how to do this with synonyms or anything else. The Solr version is 6.2.1.

Upvotes: 2

Views: 61

Answers (1)

femtoRgon

Reputation: 33351

I'd probably add a PatternReplaceCharFilter to your analyzer for this.

Something like:

<analyzer>
  <charFilter class="solr.PatternReplaceCharFilterFactory"
         pattern="(\d+(\.\d+)?)M\b" replacement="$1 Molar"/>
  <tokenizer ...
</analyzer>

CharFilters preprocess the input before tokenization happens, so you don't need to worry about the pattern spanning multiple terms (as you would with a PatternReplace token filter), or about lowercasing having already folded M into m, which matters if you are also dealing with molalities.
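
To cover the # Normal / #N case from the question as well, the same idea could be extended with a second char filter. This is only a sketch: the field type name text_concentration, the choice of StandardTokenizerFactory and LowerCaseFilterFactory, and the trailing \b in the patterns are my assumptions, not something taken from your schema.

```xml
<!-- Sketch only: the field type name, tokenizer and lowercase filter are assumptions -->
<fieldType name="text_concentration" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Rewrites 6M, 12M, 0.1M as 6 Molar, 12 Molar, 0.1 Molar.
         The trailing \b stops the pattern from matching inside 6Molar. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="(\d+(\.\d+)?)M\b" replacement="$1 Molar"/>
    <!-- Same rewrite for normality: 2N becomes 2 Normal -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="(\d+(\.\d+)?)N\b" replacement="$1 Normal"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Runs after the char filters, so M and m are still distinct when the patterns are applied -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because a single analyzer element with no type attribute is used at both index and query time, both spellings should end up normalized to the long form on either side, which gives you the "vice versa" matching. Documents indexed before the change need to be reindexed for this to take effect.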

Upvotes: 3
