Alb

Reputation: 3681

How to parse Lucene string queries that use spanNot and spanNear

If I have the following Lucene query as a String, is it possible to parse it with an existing Lucene library?

+spanNot(spanNear([text:word1, text:word2], 10, true), text:mydelimiter)

I'm using Lucene 3.0.0. I've tried the QueryParser in core; it gives no error but creates an incorrect BooleanQuery. I've also tried the StandardQueryParser in lucene-contrib (which also gives a BooleanQuery) and org.apache.lucene.queryParser.surround.parser.QueryParser, which results in an error (Encountered ""(""(""....)

Is my only choice to construct the equivalent query in code?

(FYI, my overall goal is to find terms in any order within the same sentence, by replacing sentence-ending periods with "mydelimiter" in the document before indexing.)
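
For context, here is a minimal sketch of that preprocessing step. It assumes a period followed by whitespace or end of input marks a sentence end; the class and method names are hypothetical, and abbreviations such as "e.g." would be split incorrectly.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: marks sentence ends before indexing.
final class SentenceDelimiter {
    // Marker token from the question; it must survive analysis as a single term.
    static final String DELIMITER = "mydelimiter";

    // Assumption: a period followed by whitespace or end of input ends a sentence.
    private static final Pattern SENTENCE_END = Pattern.compile("\\.(?=\\s|$)");

    private SentenceDelimiter() {}

    /** Replaces each sentence-ending period with a space and the delimiter token. */
    static String markSentenceEnds(String text) {
        return SENTENCE_END.matcher(text).replaceAll(" " + DELIMITER);
    }

    public static void main(String[] args) {
        // prints: First word1 here mydelimiter Then word2 there mydelimiter
        System.out.println(markSentenceEnds("First word1 here. Then word2 there."));
    }
}
```

Periods inside tokens such as "3.0.0" are left alone because they are not followed by whitespace.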

Upvotes: 0

Views: 912

Answers (1)

milan

Reputation: 12420

From the book Lucene in Action:

QueryParser doesn’t support any of the SpanQuery types, but the surround QueryParser in Lucene’s contrib modules does.

Here are some examples of the surround parser's syntax:

aa NOT bb NOT cc – same effect as: (aa NOT bb) NOT cc
and(aa,bb,cc) – aa and bb and cc
99w(aa,bb,cc) – ordered span query with slop 98
99n(aa,bb,cc) – unordered span query with slop 98
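
As a rough illustration of the proximity forms above, the following sketch formats a slop value and a list of terms into surround syntax. The class and method names are hypothetical, and the rule "prefix = slop + 1" is inferred only from the two examples above (99 corresponds to slop 98). It covers just the spanNear part; the spanNot exclusion is not expressed by any of these examples.

```java
// Hypothetical helper, not part of Lucene: builds surround-syntax proximity strings.
final class SurroundSyntax {
    private SurroundSyntax() {}

    /**
     * Formats a proximity query in surround syntax.
     * inOrder = true yields the "w" (ordered) form, false the "n" (unordered) form.
     */
    static String near(int slop, boolean inOrder, String... terms) {
        if (slop < 0) {
            throw new IllegalArgumentException("slop must be non-negative");
        }
        if (terms.length < 2) {
            throw new IllegalArgumentException("need at least two terms");
        }
        // Inferred from the examples above: a prefix of N corresponds to slop N - 1.
        int distance = slop + 1;
        String operator = inOrder ? "w" : "n";
        return distance + operator + "(" + String.join(",", terms) + ")";
    }

    public static void main(String[] args) {
        System.out.println(near(98, true, "aa", "bb", "cc"));   // prints 99w(aa,bb,cc)
        System.out.println(near(10, false, "word1", "word2")); // prints 11n(word1,word2)
    }
}
```

Note that the `true` in `spanNear([text:word1, text:word2], 10, true)` requests ordered matching, so if the terms should match in any order, the unordered `n` form is probably the one to use.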

It looks like you'll have to either change your query syntax to match this one and use the surround parser, or extend QueryParser.

Upvotes: 0
