shredding

Reputation: 5601

Solr does not suggest complete words

My Solr installation suggests only word stems, not the complete words.

If I search for conductor, I get results like this:

<int name="conductor">68</int>
<int name="symphoni">51</int>
<int name="no.">46</int>
<int name="rattl">28</int> 

What I would like to have would be:

and so on.

The complete generated query is:

select?fl=abstract&facet=true&facet.field=abstract&facetlimit=8&facet.mincount=1&omitHeader=true&qf=content%5E40.0+title%5E5.0+keywords%5E2.0+tagsH1%5E5.0+tagsH2H3%5E3.0+tagsH4H5H6%5E2.0+tagsInline&json.nl=map&q=conductor&start=0&rows=5

I use TYPO3 so the config xml can be found here:

https://github.com/subugoe/typo3-solr/blob/master/resources/solr/typo3cores/conf/solrconfig.xml

And the schema can be found here:

https://github.com/subugoe/typo3-solr/blob/master/resources/solr/typo3cores/conf/english/schema.xml

Upvotes: 0

Views: 1216

Answers (2)

Paige Cook

Reputation: 22555

arun is correct: this issue occurs because you are retrieving facets from a field that is stemmed by the index analyzers. I looked at the other fieldType definitions supplied by TYPO3, and the textSpell fieldType looks promising.

I would suggest adding the following to the general_schema_fields.xml file:

 <field name="abstract_facet" type="textSpell" indexed="true" stored="true" />
 <copyField source="abstract" dest="abstract_facet" />

You will need to reindex your data for these changes to take effect. You can then run the following query, which should give you better results.

 select?fl=abstract&facet=true&facet.field=abstract_facet&facet.limit=8&facet.mincount=1
  &omitHeader=true&qf=content%5E40.0+title%5E5.0+keywords%5E2.0+tagsH1%5E5.0
  +tagsH2H3%5E3.0+tagsH4H5H6%5E2.0+tagsInline
  &json.nl=map&q=conductor&start=0&rows=5

If this does not completely satisfy your needs, I would recommend checking out the Solr Wiki - Analyzers, Tokenizers and Token Filters for more guidance on how the values are being processed and stored in the index. Ultimately, you may want to create a completely separate fieldType for use with faceting.
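As a rough sketch of what such a separate fieldType could look like: the fragment below tokenizes and lowercases the text but applies no stemming filter, so the facet values remain whole words. The names text_facet and abstract_words are hypothetical, and the stopwords.txt path is an assumption that may need adjusting to wherever your core keeps its stop word list.

```xml
<!-- Hypothetical fieldType for faceting: tokenized and lowercased, but NOT stemmed.
     Place it wherever your schema defines its other field types. -->
<fieldType name="text_facet" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: a stopwords.txt file is available in the core's conf directory;
         remove this filter if you have none. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field name. Faceting reads indexed terms, so storing the value is optional. -->
<field name="abstract_words" type="text_facet" indexed="true" stored="false" />
<copyField source="abstract" dest="abstract_words" />
```

After reindexing, you would facet on this field with facet.field=abstract_words. Leaving out the stop word filter should also work, but common words such as "the" would then likely dominate the facet counts.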

Upvotes: 1

arun

Reputation: 11023

You have only two field types in your schema, and both do stemming with SnowballPorterFilterFactory. You can add a copy field whose type does not do stemming and facet on that field to get full words instead of stemmed words.

Upvotes: 1
