joshlf

Reputation: 23567

Lucene exact match query

I want to construct a Lucene query that only matches documents with exactly the terms I specify: no fewer, and no more. The "no fewer" part is easy: a BooleanQuery with all mandatory terms. However, I'm not sure how to do the "no more" part. In essence what I need is a query which says "the result documents cannot have any terms other than what I've specified in the query." Any ideas? Thanks!

Upvotes: 3

Views: 5250

Answers (1)

Artur Nowak

Reputation: 5354

I think you can approach this problem as follows:

  • Create an analyzer that extracts the tokens, removes duplicates, and then concatenates the remaining tokens in a fixed order (e.g. lexicographical). So if you have three documents:

doc1: "lorem ipsum", doc2: "lorem ipsum dolor", doc3: "lorem ipsum lorem"

the analyzer will produce the following values for them:

doc1: "ipsum lorem", doc2: "dolor ipsum lorem", doc3: "ipsum lorem"

  • Then create a field that is populated by this analyzer.
  • Finally, apply the same analyzer to your query and match against this special field. For the query "lorem ipsum", the only query term would be "ipsum lorem", which matches doc1 and doc3 but not doc2.

The code to achieve this would be too long to fit in this answer, but I hope you get the general idea: create a field that you can match against in full.
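
A full custom analyzer is out of scope here, but the canonicalization step on its own can be sketched in plain Java, without any Lucene classes. This is only a minimal sketch under some assumptions: `CanonicalKey` and `canonicalKey` are hypothetical names, tokens are split on whitespace, and everything is lowercased (swap in whatever tokenization your real analyzer uses). One way to use it would be to store the resulting string in a non-analyzed field (for example a `StringField`) at index time and look it up with a single `TermQuery` at search time.

```java
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: builds the "canonical key"
// described above (distinct tokens, sorted, joined by a single space).
final class CanonicalKey {

    private CanonicalKey() {}

    static String canonicalKey(String text) {
        // TreeSet drops duplicates and keeps tokens in lexicographical order.
        TreeSet<String> terms = new TreeSet<>();
        // Assumption: whitespace tokenization plus lowercasing is good enough.
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return String.join(" ", terms);
    }
}
```

With the three example documents, doc1 and doc3 end up with the same key while doc2 gets a different one, so a query for "lorem ipsum" would match only doc1 and doc3. Note that the same function has to be applied to both the indexed text and the query text, otherwise the keys will not line up.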

Upvotes: 5
