user3685285

Reputation: 6586

Do Lucene SpanQueries tokenize automatically, or do I have to tokenize the query myself?

I'm indexing with the standard analyzer, which tokenizes on special characters and removes stop words such as 'the'. My questions are:

(1) If I make a SpanQuery and I search for "The Best Stuff", but the word "the" is not stored, do I need to write code to take out this word so I am only searching for "Best Stuff", or is it automatically handled for me?

(2) Do I have to handle the lowercasing myself too?

Upvotes: 0

Views: 22

Answers (1)

femtoRgon

Reputation: 33341

1 - Analysis at query time is generally handled by query parsers; as a rule, Query objects don't do any analysis themselves. So if you are constructing the queries yourself, SpanQueries included, then yes, you will have to deal with any analysis concerns. That means more than removing "The": "Best Stuff" will most likely be analyzed into two terms ("best" and "stuff"), and each has to be represented as its own term in your SpanQuery.

2 - Yes. Lowercasing is part of analysis too, so it is not applied to a query you build by hand.
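
To make the two points above concrete, here is a minimal plain-Java sketch of the normalization you would have to do before building the span clauses. It is only a stand-in, not Lucene's own analysis: the class name `QueryTerms`, the `normalize` helper, the stop-word list, and the splitting rule are all assumptions made for illustration. In a real application it is usually safer to run the query text through the same Analyzer you used at index time, so that both sides agree, and then wrap each resulting term in a SpanTermQuery inside a SpanNearQuery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: mimics (roughly) what an analyzer does to query text.
final class QueryTerms {
    // Assumed stop-word list, for illustration only.
    // It must match whatever your index-time analyzer actually removes.
    static final Set<String> STOP_WORDS = Set.of("the", "a", "an", "and", "of");

    // Splits on runs of non-letter/non-digit characters, lowercases each
    // token, and drops empty tokens and stop words.
    static List<String> normalize(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            String term = token.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Each surviving term would become one clause of the span query.
        System.out.println(normalize("The Best Stuff")); // prints [best, stuff]
    }
}
```

Each element of the returned list corresponds to one term clause, so "The Best Stuff" ends up as a two-clause span query over "best" and "stuff".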

Upvotes: 1
