Reputation: 907
I have two versions of Solr running on my machine, say SolrVer1 and SolrVer2.
SolrVer1 applies the following stemming filters on the field type text_en_splitting:
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" ignoreCase="true"/>
<filter class="solr.PorterStemFilterFactory" ignoreCase="true"/>
SolrVer2 applies the following stemming filter on the field type text_en_splitting:
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
Both behave almost the same for regular searches, but wildcard searches on SolrVer1 do not return the grammatical variants of a word. For example, when searching with ray*, SolrVer1 returns far fewer results than SolrVer2. Looking at the results, I found that SolrVer1 does not return documents that contain only ray and rays.
I don't know when I should use SnowballPorterFilterFactory and when I should use PorterStemFilterFactory. What are the pros and cons of each?
Does anybody have an idea about this behavior?
Thanks
Upvotes: 0
Views: 1248
Reputation: 52779
On wildcard and fuzzy searches, no text analysis is performed on the search word.
Since no analysis is done at query time for wildcard searches, the stemmers are not applied to the query term; the raw pattern is matched against the stemmed terms already in the index.
The results therefore differ depending on what each stemmer produced at index time.
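To make this concrete, here is a minimal Python sketch of the idea, not Lucene's actual implementation. The function name wildcard_hits and the contents of both index sets are hypothetical; it only assumes that one stemmer indexed ray as rai while the other kept it as ray.

```python
from fnmatch import fnmatchcase


def wildcard_hits(pattern, indexed_terms):
    """Return indexed terms that match the raw, unanalyzed wildcard pattern."""
    return sorted(t for t in indexed_terms if fnmatchcase(t, pattern))


# Hypothetical index contents after index-time stemming.
index_a = {"rai", "rayon"}  # assumed: a stemmer that turns "ray" into "rai"
index_b = {"ray", "rayon"}  # assumed: a stemmer that leaves "ray" unchanged

# The pattern "ray*" is never stemmed, so it cannot match the term "rai".
print(wildcard_hits("ray*", index_a))  # ['rayon']
print(wildcard_hits("ray*", index_b))  # ['ray', 'rayon']
```

The query side is identical in both cases; only the indexed terms differ, which is enough to change the result set.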
Upvotes: 0
Reputation: 11023
You need to know what the stemmers output for ray and rays.
Try stemming them with the online Porter stemmer tool: http://qaa.ath.cx/porter_js_demo.html. It outputs rai! That is why you don't get any matches for ray* with the Porter stemmer: the indexed term rai does not start with ray.
And here is a tool for the Snowball stemmer: http://snowball.tartarus.org/demo.php.
It outputs ray for both ray and rays, which is why you get the results.
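The difference comes from how each algorithm treats a trailing y. The sketch below implements only that single rule (step 1c) from each algorithm as I understand the published definitions; it is not a full stemmer, and the function names are made up for illustration.

```python
def _is_consonant(word, i):
    """Original Porter definition: 'y' is a consonant at the start of a
    word or after a vowel, and a vowel after a consonant."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        return i == 0 or not _is_consonant(word, i - 1)
    return True


def porter_step_1c(word):
    """Original Porter: terminal y -> i if the stem before it has a vowel."""
    stem = word[:-1]
    has_vowel = any(not _is_consonant(stem, i) for i in range(len(stem)))
    if word.endswith("y") and has_vowel:
        return stem + "i"
    return word


def snowball_english_step_1c(word):
    """Snowball English: terminal y -> i only if the preceding letter is a
    non-vowel that is not the first letter of the word."""
    if len(word) > 2 and word[-1] == "y" and word[-2] not in "aeiouy":
        return word[:-1] + "i"
    return word


for w in ("ray", "happy"):
    print(w, porter_step_1c(w), snowball_english_step_1c(w))
# ray rai ray
# happy happi happi
```

Both algorithms first reduce rays to ray by stripping the plural s, so this one rule decides whether the indexed term ends up as rai or ray.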
You may want to read this for a comparison of the two stemmers: http://snowball.tartarus.org/texts/introduction.html
It appears that Snowball was designed to address such shortcomings of Porter.
Upvotes: 1