Keval

Reputation: 1859

How to prevent WordDelimiterFilterFactory from splitting on "." (dot)

I want to use WordDelimiterFilterFactory for a requirement like this:

Input: 500bc

I want to be able to search for it with "500bc" or just "500".

For that, I used WordDelimiterFilterFactory with:

<filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"/>

The problem is that it now also splits on "." (dot), so a query for "6.25" also returns results for "25".

How can I stop WordDelimiterFilterFactory from splitting on "." (dot)?

Upvotes: 1

Views: 962

Answers (2)

Keval

Reputation: 1859

I used:

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" preserveOriginal="1" types="wdfftypes.txt"/>

In wdfftypes.txt I put:

. => DIGIT

How it works: Solr now treats "." as a digit, so every character in "6.25" is a digit and WordDelimiterFilterFactory no longer splits it.
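
For context, here is a minimal sketch of how that filter might sit inside a complete field type. The field type name "text_wdf", the whitespace tokenizer, and the lowercase filter are assumptions for illustration and are not part of the setup described above; only the WordDelimiterFilterFactory line comes from it. The types file is assumed to live in the same conf directory as the schema.

```xml
<!-- Hypothetical field type: the name and the tokenizer/lowercase choices are assumptions. -->
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so "6.25" and "500bc" reach the filter intact. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- wdfftypes.txt maps "." to DIGIT, so "6.25" is a single run of digits. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0"
            preserveOriginal="1"
            types="wdfftypes.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a setup like this, "500bc" should produce the tokens "500bc" and "500", while "6.25" should stay a single token. The Analysis screen in the Solr admin UI is a quick way to check what your own configuration actually emits.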

Upvotes: 3

spyk

Reputation: 898

Try adding the generateNumberParts="0" parameter to your filter declaration; that will prevent the filter from splitting numbers on punctuation. See the filter descriptions for more details: https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-WordDelimiterFilter
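
As a rough sketch, the declaration from the question with this parameter added might look like the following. Keeping preserveOriginal="1" is an assumption carried over from the question, not something stated in this answer.

```xml
<!-- Sketch only: the question's filter with number-part generation switched off. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateNumberParts="0"
        preserveOriginal="1"/>
```

One thing to verify before relying on it: since this switches off all number subwords, "500" would likely no longer be emitted for "500bc" either, so it is worth checking the output on the Analysis screen against the original requirement.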

Upvotes: 0
