ahmet2106

Reputation: 5007

Solr query does not work properly for "field starts with" searches

I am new to Solr. I have developed a grouped search that groups search results by object_class (each database table has an object class, such as User, Artist, ...).

Now I am trying to build a search that looks in fields such as headline, subtitle, content and biography.

For each object class I run a separate search query (because Solr does not let you use different sort orders within a grouped result).

The problem: there is an Artist with the headline "Cuebrick". Normally this should be found when searching for Cueb or even headline:Cueb*, but it is not.

As you can see in the screenshots, I am searching for cueb, headline:cueb* and headline:cuebrick, each combined with object_class:Artist ( ... AND ... ).

Why aren't my "like" queries working?

Query: cueb AND object_class:Artist

query1 not working

Query: headline:cueb* AND object_class:Artist

query2 not working

Query: headline:cuebrick AND object_class:Artist

query3 the right result

The important part of my schema looks like this:

<field name="headline" type="text_de" indexed="true" stored="true" stripHTML="true" />
(... same for content, subtitle and biography)


<defaultSearchField>text</defaultSearchField>

<copyField source="headline" dest="text"/>
<copyField source="content" dest="text"/>
<copyField source="keywords" dest="text"/>
<copyField source="subtitle" dest="text"/>
<copyField source="biography" dest="text"/>

and here is my text_de definition (I changed it just now; do I have to reindex? A restart didn't change anything):

<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_de.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" splitOnNumerics="1" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_de.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" splitOnNumerics="1" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2" />
  </analyzer>
</fieldType>

Upvotes: 0

Views: 1604

Answers (1)

BitByter GS

Reputation: 999

The token "cuebrick" is stored in your index as "cubrick" because of the filter solr.SnowballPorterFilterFactory with language="German2". The German2 stemmer treats "ue" as the umlaut "ü" and then reduces it to "u".

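Your query headline:cueb* is a wildcard query. Solr does not run the field's analysis chain on the text of a wildcard query, so it looks for tokens with the literal prefix "cueb" and finds no match, because the indexed token is "cubrick". The query headline:cuebrick works because it is not a wildcard query: it is analyzed and stemmed to "cubrick" before the lookup.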

Change your query to headline:cub* and check the results.
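If rewriting the prefix by hand is not practical, one possible alternative is to keep an unstemmed copy of the headline and run prefix queries against that copy. The following is only a sketch under assumptions: the names text_prefix and headline_prefix are made up for illustration and do not exist in the schema above, and the index has to be rebuilt before the new field contains any terms.

```xml
<!-- Hypothetical field type: tokenized and lowercased, but NOT stemmed,
     so the indexed token for "Cuebrick" stays "cuebrick". -->
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field that receives an unstemmed copy of headline. -->
<field name="headline_prefix" type="text_prefix" indexed="true" stored="false"/>
<copyField source="headline" dest="headline_prefix"/>
```

With something like this in place, a query such as headline_prefix:cueb* AND object_class:Artist should match, while the stemmed headline field stays available for normal full-text search. Because the wildcard term may not be analyzed, it is safest to lowercase it before sending the query.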

Upvotes: 3
