Reputation: 133
I have basically the same problem as discussed in "Solr wildcard query with whitespace", but that question was never answered.
I'm using a wildcard in a filter query on a field called "brand."
I'm having trouble when the brand name has whitespace in it. For instance, filtering on the brand "Lexington" works fine with fq={!tag=brand}brand:Lexing*n. A multi-word brand like "Authentic Models" causes problems, however: it seems the name must be wrapped in double quotes.
Inside double quotes, though, the * has no effect, i.e. brand:"Authentic Mode*" or brand:"Lexingt*" won't match anything. With no quotes and no space, brand:Authen* does work and matches Authentic Models. But once whitespace is included in the brand name, Solr seems to consider only the string up to the first space when matching.
The brand field is of type
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
which is not whitespace tokenized, to my understanding. It is populated with a copyField from a whitespace tokenized field, though.
Is there something I can do to stop Solr from tokenizing the filter query without using double quotes?
Upvotes: 10
Views: 12995
Reputation: 1117
As Rob said in his answer, I've posted my own answer on the question he linked to.
All you need to do is escape the space in your query (as in, customer_name:Pop *Tart --> customer_name:Pop\ *Tart).
In my experience, this method works no matter where you place the wildcard. That is backed up by how Solr reports parsing the query; something like:
customer_name:Pop\ *Tart
is parsed as:
customer_name:Pop *Tart
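If the filter query is built in client code, the escaping can be done programmatically. The following is a minimal sketch in Python, not part of the original answer: the helper names `escape_keep_wildcards` and `brand_filter` are made up for illustration, and it assumes you want `*` and `?` to stay active as wildcards while whitespace and the other query-syntax characters are backslash-escaped.

```python
import re

# Whitespace plus Solr/Lucene query syntax characters, except * and ?,
# which are deliberately left alone so they keep acting as wildcards.
_SPECIAL = re.compile(r'([\s+\-&|!(){}\[\]^"~:\\/])')


def escape_keep_wildcards(value: str) -> str:
    """Backslash-escape whitespace and query syntax, keeping * and ? intact."""
    return _SPECIAL.sub(r'\\\1', value)


def brand_filter(pattern: str, field: str = "brand") -> str:
    """Build a field:pattern clause suitable for an fq parameter (hypothetical helper)."""
    return f"{field}:{escape_keep_wildcards(pattern)}"


print(brand_filter("Authentic Mode*"))  # brand:Authentic\ Mode*
```

The resulting string still has to be URL-encoded when it is sent as an `fq` parameter; most HTTP clients do that for you.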
Upvotes: 14
Reputation: 1683
I have added a possible solution back on the original question, "Solr wildcard query with whitespace".
Note this ONLY works with trailing wildcards. I know this question example uses the wildcard within the string, but it serves to answer a specific case of the question in point.
Basically it amounts to using the FieldQParserPlugin query parser. Check my post on the original question for more details, so I don't get scorned for repeating myself.
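The linked post is not reproduced here, so the sketch below is not that answer's method. As a related illustration of the trailing-wildcard case, Solr's prefix query parser (`{!prefix f=...}`) takes the text after the local params as a raw value, so a space needs no quoting or escaping. The Python helper `prefix_filter` is a hypothetical name, and the `tag` argument simply mirrors the `{!tag=brand}` from the question.

```python
from typing import Optional


def prefix_filter(prefix: str, field: str = "brand", tag: Optional[str] = None) -> str:
    """Build an fq value that uses Solr's prefix query parser (hypothetical helper).

    The prefix is passed through unescaped: with local-params syntax the
    remainder of the string is handed to the parser as a single raw value.
    """
    params = ["!prefix"]
    if tag:
        params.append(f"tag={tag}")
    params.append(f"f={field}")
    return "{" + " ".join(params) + "}" + prefix


print(prefix_filter("Authentic Mode", tag="brand"))
# {!prefix tag=brand f=brand}Authentic Mode
```

Note that no `*` is appended: the prefix parser matches every value starting with the given text, which is why this only covers trailing wildcards.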
Upvotes: 0
Reputation: 1318
Try changing the field type from string to something like text. The string type is not tokenized, so when a string field contains whitespace, Solr tries to match your query against the whole value, whitespace included.
In the default schema file you can see this line just above the string field type:
<!-- The StrField type is not analyzed, but indexed/stored verbatim. -->
Using a text type, such as text_general or a similar one, should fix your problem.
Upvotes: 2