Reputation: 5601
My Solr installation suggests only word stems, not the complete words.
If I search for conductor, I get results like this:
<int name="conductor">68</int>
<int name="symphoni">51</int>
<int name="no.">46</int>
<int name="rattl">28</int>
What I would like to have would be:
and so on.
The complete generated query is:
select?fl=abstract&facet=true&facet.field=abstract&facetlimit=8&facet.mincount=1&omitHeader=true&qf=content%5E40.0+title%5E5.0+keywords%5E2.0+tagsH1%5E5.0+tagsH2H3%5E3.0+tagsH4H5H6%5E2.0+tagsInline&json.nl=map&q=conductor&start=0&rows=5
I use TYPO3, so the solrconfig.xml can be found here:
https://github.com/subugoe/typo3-solr/blob/master/resources/solr/typo3cores/conf/solrconfig.xml
And the schema can be found here:
https://github.com/subugoe/typo3-solr/blob/master/resources/solr/typo3cores/conf/english/schema.xml
Upvotes: 0
Views: 1216
Reputation: 22555
arun is correct: this happens because you are retrieving facets for a field that is stemmed by the index analyzers, so the facet values are the stemmed terms stored in the index rather than the original words. I looked at the other fieldType definitions supplied by TYPO3, and the textSpell fieldType looks promising.
I would suggest adding the following to the general_schema_fields.xml file:
<field name="abstract_facet" type="textSpell" indexed="true" stored="true" />
<copyField source="abstract" dest="abstract_facet" />
You will need to reindex your data for these changes to take effect. After that, you can run the following query, which facets on abstract_facet instead of abstract and should give you better results.
select?fl=abstract&facet=true&facet.field=abstract_facet&facet.limit=8&facet.mincount=1
&omitHeader=true&qf=content%5E40.0+title%5E5.0+keywords%5E2.0+tagsH1%5E5.0
+tagsH2H3%5E3.0+tagsH4H5H6%5E2.0+tagsInline
&json.nl=map&q=conductor&start=0&rows=5
If this does not completely satisfy your needs, I would recommend checking the Solr Wiki page on Analyzers, Tokenizers and Token Filters for more guidance on how values are processed and stored in the index. Ultimately, you may want to create a completely separate fieldType for use with faceting.
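As a rough sketch of what such a separate fieldType could look like: the name text_facet is made up for illustration, and the analyzer chain is only an assumption about what you might want (tokenize and lowercase, but no SnowballPorterFilterFactory, so terms are not stemmed). Where exactly the fieldType and field definitions belong depends on how the TYPO3 schema files are split up, so treat the placement as something to verify against your setup.

```xml
<!-- Hypothetical field type for faceting: splits text into words and
     lowercases them, but applies no stemming filter -->
<fieldType name="text_facet" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Facets are computed from indexed terms, so the copy does not need to be stored -->
<field name="abstract_facet" type="text_facet" indexed="true" stored="false" />
<copyField source="abstract" dest="abstract_facet" />
```

With a chain like this, the facet values should be whole lowercased words rather than stems. If you use this definition, it replaces the textSpell-based abstract_facet field above rather than being added alongside it, and a reindex is still required.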
Upvotes: 1
Reputation: 11023
You have only two field types in your schema, and both do stemming with SnowballPorterFilterFactory. You can add a copy field whose type does not do stemming and facet on that field to get the full words instead of the stemmed ones.
Upvotes: 1