Lucene QueryParser vs. TermQuery

Question

I'm currently unsure about the behavior of the QueryParser vs. TermQuery in Lucene; I'm using Lucene 3.6.

In my example I'm running the following queries against the same index, where the field in question is set to Field.Store.NO and Field.Index.NOT_ANALYZED_NO_NORMS.

Query q1 = new TermQuery(new Term("names", "test three"));

QueryParser q2p = new QueryParser(GenericIndexer.LUCENE_VERSION, "names", someAnalyzer);
Query q2 = q2p.parse("names:test three");
Query q3 = q2p.parse("names:\"test three\"");

In both cases, q2 and q3, I'm unable to reproduce the same query as q1. Printing out the queries shows:

q1 = names:test three
q2 = names:test names:three
q3 = names:"test three"

Because of this difference, queries q2 and q3 return no results, while query q1 returns the expected result.

Question: Is there a way to have the query parser produce the same query as the TermQuery, or am I missing some fundamental Lucene concept here?

Note: for the QueryParser, the analyzer is the same one used during indexing, although I'm not sure how relevant this information is.

femtoRgon · Accepted Answer

With your TermQuery, you are producing a single term test three. Since this field is not analyzed, producing a single term is correct.

In q2, you are seeing two separate terms due to the query parser's syntax. A field prefix applies only to the term directly after it, so what it is really doing is parsing to a query like names:test defaultField:three, though that's not obvious since your default field is also "names".

In q3 (where your note is, indeed, quite relevant!), you produce a phrase query, which isn't quite the same as the TermQuery you provided in q1, but with the right analyzer, it can be equivalent. PhraseQueries are analyzed, and I'm guessing the analyzer being used by the query parser there is StandardAnalyzer, or something like it. The difference is in what the terms look like:

Terms analyzed by StandardAnalyzer: "test" and "three" (two terms)
Terms in unanalyzed field: "test three" (one term)

So, there are no identical terms to match between the two representations. Instead, try using KeywordAnalyzer, which is effectively the same as using an un-analyzed field.

You generally want to make sure you use the same analyzer in your QueryParser that you do to analyze your documents, with KeywordAnalyzer being the de-facto equivalent analyzer for an un-analyzed field.
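
To make the term mismatch concrete, here is a small toy model in plain Java. It is not Lucene code: keywordTerms, tokenizedTerms and matches are hypothetical helpers invented for illustration, the tokenizer is only a rough stand-in for StandardAnalyzer (lowercase, split on whitespace), and term positions are ignored. It only shows that a query matches when its terms are byte-for-byte identical to the terms stored in the index.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TermMatchSketch {

    // Keyword-style analysis: the whole input is kept as one term.
    // This models a NOT_ANALYZED field, or a query run through KeywordAnalyzer.
    static List<String> keywordTerms(String text) {
        return Collections.singletonList(text);
    }

    // Simplified tokenizing analysis (an assumption, not StandardAnalyzer itself):
    // lowercase the input and split it on whitespace.
    static List<String> tokenizedTerms(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // A query matches only if every query term exists verbatim in the index.
    // Positions are ignored, which is enough to show the mismatch.
    static boolean matches(Set<String> indexedTerms, List<String> queryTerms) {
        return indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        // The un-analyzed field stores the single term "test three".
        Set<String> indexed = new HashSet<String>(keywordTerms("test three"));

        // Tokenized query terms "test" and "three" are not in the index.
        System.out.println(matches(indexed, tokenizedTerms("test three"))); // false

        // A keyword-style query term "test three" is in the index.
        System.out.println(matches(indexed, keywordTerms("test three")));   // true
    }
}
```

In the real setup, the second case corresponds to passing a KeywordAnalyzer to the QueryParser while keeping the quoted form from q3.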
