Reputation: 1339
I'm having a bit of trouble finding information on what's happening with my Lucene searches.
(Id:gloves* Search:gloves* SpellCheckerSource:gloves*) OR
(Id:gloves Search:gloves SpellCheckerSource:gloves) OR
(Id:glove* Search:glove* SpellCheckerSource:glove*)
When I search for the above, I get the following rewritten query:
(() () ())
(Id:glove Search:glove SpellCheckerSource:glove)
(() ConstantScore(Search:glove*) ConstantScore(SpellCheckerSource:glove*))
This is from LUKE (http://www.getopt.org/luke/); I have been running the query there to try to see what's going on.
What I want to be able to do is search for a term such as gloves*, but that clause ends up as (() () ()).
I don't understand why it gets translated like this. Is there an issue with my query or with my index?
LUKE tells me the structure explanation is as follows
lucene.BooleanQuery
  clauses=3, maxClauses=1024
  Clause 0: SHOULD
    lucene.BooleanQuery
      clauses=3, maxClauses=1024
      Clause 0: SHOULD
        lucene.BooleanQuery
          clauses=0, maxClauses=1024, coord=false
      Clause 1: SHOULD
        lucene.BooleanQuery
          clauses=0, maxClauses=1024, coord=false
      Clause 2: SHOULD
        lucene.BooleanQuery
          clauses=0, maxClauses=1024, coord=false
  Clause 1: SHOULD
    lucene.BooleanQuery
      clauses=3, maxClauses=1024
      Clause 0: SHOULD
        lucene.TermQuery
          Term: field='Id' text='glove'
      Clause 1: SHOULD
        lucene.TermQuery
          Term: field='Search' text='glove'
      Clause 2: SHOULD
        lucene.TermQuery
          Term: field='SpellCheckerSource' text='glove'
  Clause 2: SHOULD
    lucene.BooleanQuery
      clauses=3, maxClauses=1024
      Clause 0: SHOULD
        lucene.BooleanQuery
          clauses=0, maxClauses=1024, coord=false
      Clause 1: SHOULD
        lucene.ConstantScoreQuery, ConstantScore(Search:glove*)
          Filter: Search:glove*
      Clause 2: SHOULD
        lucene.ConstantScoreQuery, ConstantScore(SpellCheckerSource:glove*)
          Filter: SpellCheckerSource:glove*
This seems strange to me on multiple levels.
It should be noted that everything works fine when I search for a term without an s (i.e. glove) or without a wildcard; only the combination of the two seems to break the query.
Upvotes: 1
Views: 296
Reputation: 33351
This is probably happening because there are no terms in your index that match "gloves*".
When a MultiTermQuery is rewritten, it finds the Terms that are suitable, and creates primitive queries (such as TermQuery) on those terms. If no suitable terms are found, you'll see an empty query generated instead, like what you've shown.
A TermQuery is already a primitive query, so no rewriting is needed there. It doesn't have to enumerate terms; it simply looks up the one term it was given.
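To make the rewriting step concrete, here is a minimal sketch in Python of what "enumerate the matching terms" means for a prefix query. It is a toy model, not Lucene's code: `rewrite_prefix`, `term_dict`, and the indexed terms are all hypothetical, chosen to mirror the situation in the question (plurals stemmed away, nothing in the Id field starting with "glove").

```python
from bisect import bisect_left

def rewrite_prefix(field, prefix, term_dict):
    """Toy model of rewriting a prefix query: list the indexed terms it expands to."""
    terms = term_dict.get(field, [])      # sorted list of terms indexed for this field
    start = bisect_left(terms, prefix)    # first term >= prefix
    matches = []
    for term in terms[start:]:
        if not term.startswith(prefix):
            break                         # sorted order: no later term can match
        matches.append((field, term))
    return matches                        # empty list ~ the empty "()" clause

# Hypothetical index contents (assumption, not taken from the real index).
term_dict = {
    "Id": ["1001", "1002"],
    "Search": ["glove", "hat", "scarf"],
}

print(rewrite_prefix("Search", "glove", term_dict))   # expands to the term "glove"
print(rewrite_prefix("Search", "gloves", term_dict))  # nothing starts with "gloves"
print(rewrite_prefix("Id", "glove", term_dict))       # nothing in Id starts with "glove"
```

In this model an empty result corresponds to the empty clauses shown by LUKE: the prefix had no indexed terms to expand to.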
The other piece of this is analysis. Your query for gloves is getting analyzed to glove (EnglishAnalyzer, perhaps?). MultiTermQueries (such as wildcard, fuzzy, regex and prefix queries) are not analyzed by the QueryParser. Your prefix query is trying to find terms starting with "gloves", but all those plural s endings have been stemmed away, so it doesn't find any matches.
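The mismatch between analyzed and unanalyzed terms can be illustrated with a small Python sketch. Everything here is an assumption for illustration: `toy_stem` is a crude stand-in for a real stemming analyzer, and `parse_clause` only mimics the behaviour described above (plain terms are analyzed, terms ending in * are passed through untouched).

```python
def toy_stem(token):
    """Crude stand-in for a stemming analyzer: lowercase and drop one trailing 's'."""
    token = token.lower()
    return token[:-1] if token.endswith("s") and len(token) > 3 else token

def index_terms(texts):
    """Analyze every token at index time and keep the distinct resulting terms."""
    return sorted({toy_stem(tok) for text in texts for tok in text.split()})

def parse_clause(text):
    """Plain terms go through the analyzer; prefix terms (trailing *) do not."""
    if text.endswith("*"):
        return ("prefix", text[:-1])
    return ("term", toy_stem(text))

def matching_terms(clause, terms):
    kind, value = clause
    if kind == "prefix":
        return [t for t in terms if t.startswith(value)]
    return [t for t in terms if t == value]

# Hypothetical documents; "gloves" is stored as "glove" after stemming.
terms = index_terms(["Leather gloves", "winter glove liners"])

print(matching_terms(parse_clause("gloves"), terms))   # analyzed to "glove", so it matches
print(matching_terms(parse_clause("gloves*"), terms))  # unanalyzed prefix "gloves", no match
print(matching_terms(parse_clause("glove*"), terms))   # prefix "glove" matches
```

Under these assumptions the practical workaround is visible too: a prefix that agrees with the stemmed form (glove*) finds the term, while the unstemmed plural (gloves*) cannot.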
Upvotes: 2