guiyu

Reputation: 1

Solr query with wildcard searches (*)

The field is defined in schema.xml:

<field name="typeDesc" type="text_general" indexed="true" stored="true"/>

The typeDesc field stores values like 公立, 公立,三甲, 公立,二甲.

The problem: when I query typeDesc:*三甲*, nothing is returned, but typeDesc:*公立*, typeDesc:*三*, typeDesc:*甲*, and typeDesc:三甲 all return results such as 公立,三甲. What is the reason?

Upvotes: 0

Views: 398

Answers (1)

MatsLindh

Reputation: 52822

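While I'm not too familiar with word-breaking rules for CJK text, I'm going to guess that the reason is that wildcard queries skip the field's tokenization: the wildcard term is matched against the indexed tokens as-is. If 三 and 甲 are split into separate tokens at index time, no single token contains 三甲, so the pattern *三甲* matches nothing. A plain query such as typeDesc:三甲 is analyzed, so it is split the same way as the indexed text and does match.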

You can confirm this by using the analysis tab of the admin page to see which tokens an indexed term is being broken into.

Possible solutions would be to index the terms in a single string field as well and do wildcard matches against that, or to use a KeywordTokenizer for your text field if you need further processing before storing the token (the keyword tokenizer keeps the text as one single token). You could also use an NGramFilter and drop the wildcards.
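
As a rough sketch of what those three options could look like in schema.xml (not tested against your setup): the names typeDesc_str, typeDesc_ngram, text_keyword and text_ngram are made up for illustration, the gram sizes are arbitrary, and the "string" type is assumed to be the solr.StrField type from the default schema. You would need to reindex after any of these changes.

```xml
<!-- Option 1: untokenized copy of the field, used only for wildcard matching.
     "typeDesc_str" is a hypothetical field name. -->
<field name="typeDesc_str" type="string" indexed="true" stored="false"/>
<copyField source="typeDesc" dest="typeDesc_str"/>

<!-- Option 2: a text type that keeps the whole value as a single token
     but still allows filters. "text_keyword" is a hypothetical type name. -->
<fieldType name="text_keyword" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Option 3: index n-grams of the value so substrings match without wildcards.
     "text_ngram" and "typeDesc_ngram" are hypothetical names;
     the gram sizes are an assumption, tune them to your data. -->
<fieldType name="text_ngram" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="4"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
<field name="typeDesc_ngram" type="text_ngram" indexed="true" stored="false"/>
<copyField source="typeDesc" dest="typeDesc_ngram"/>
```

With option 1 or 2 the query would be something like typeDesc_str:*三甲*, since the whole value 公立,三甲 is then one term. With option 3 you would query typeDesc_ngram:三甲 without wildcards, at the cost of a larger index.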

Upvotes: 1
