jacobian

Reputation: 327

Index verification tools for Lucene

How can we know the index in Lucene is correct?

Detail

I wrote a simple program that creates Lucene indexes and stores them in a folder. Using the diagnostic tool Luke, I could look inside an index and view its contents.

I realise Lucene is a standard framework for building a search engine, but I want to be sure that Lucene indexes every term that exists in a file.

Can I verify that Lucene index creation is dependable, so that not even a single term goes missing?

Upvotes: 1

Views: 454

Answers (1)

Pascal Dimassimo

Reputation: 6928

You could always build a small program that performs the same analysis you use when indexing your content. Then, for each resulting term, query your index and make sure that the source document is among the results. Repeat for all the content. But personally, I would not spend time on this. If you can open your index in Luke and run a couple of queries successfully, everything is most probably fine.
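To make the round-trip check concrete, here is a minimal sketch in Python. It does not use Lucene at all: the in-memory dictionary stands in for the real index, and `analyze`, `build_index`, and `find_missing_terms` are hypothetical names chosen for illustration. The only point is the shape of the check: re-analyze each document, look up every term, and report any term whose lookup does not return that document.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (an assumption): lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(docs):
    """Build a toy inverted index mapping each term to a set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def find_missing_terms(docs, index):
    """Re-analyze every document and return the (doc_id, term) pairs
    for which a lookup of the term does not return the document."""
    missing = []
    for doc_id, text in docs.items():
        for term in set(analyze(text)):
            if doc_id not in index.get(term, set()):
                missing.append((doc_id, term))
    return sorted(missing)


docs = {
    "doc1": "Lucene builds an inverted index.",
    "doc2": "Luke lets you inspect a Lucene index.",
}
index = build_index(docs)
missing = find_missing_terms(docs, index)  # empty list when nothing was lost
```

Against a real index, the lookup step would presumably be one term query per term through a searcher, and the analysis step would reuse the exact analyzer that was configured at indexing time. The check only proves something if both sides use the same analysis.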

Often, the real question is whether or not the analysis you did on the content will be appropriate for the queries that will be made against your index. You have to make sure that your index will have a good balance between recall and precision.
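As a rough illustration of that trade-off, the sketch below runs the same query against three toy analyzers and computes precision and recall from a hand-labelled set of relevant documents. Everything here is made up for the example: the documents, the relevance labels, and the analyzers `exact`, `lowercase`, and `prefix4` (a crude stand-in for aggressive stemming). None of it is Lucene's API.

```python
def exact(text):
    """Split on whitespace only; case is preserved."""
    return text.split()


def lowercase(text):
    """Lowercase, then split on whitespace."""
    return text.lower().split()


def prefix4(text):
    """Lowercase and keep only the first four characters of each token."""
    return [t[:4] for t in text.lower().split()]


def search(docs, analyzer, query):
    """Return ids of documents sharing at least one analyzed term with the query."""
    query_terms = set(analyzer(query))
    return {d for d, text in docs.items() if query_terms & set(analyzer(text))}


def precision_recall(hits, relevant):
    """Precision = relevant hits / all hits; recall = relevant hits / all relevant."""
    true_hits = len(hits & relevant)
    precision = true_hits / len(hits) if hits else 0.0
    recall = true_hits / len(relevant) if relevant else 0.0
    return precision, recall


DOCS = {
    1: "Lucene indexing guide",
    2: "LUCENE query syntax",
    3: "Lucerne travel guide",
}
RELEVANT = {1, 2}  # hand-labelled: documents that are really about Lucene

results = {
    fn.__name__: precision_recall(search(DOCS, fn, "Lucene"), RELEVANT)
    for fn in (exact, lowercase, prefix4)
}
```

In this toy setup the case-sensitive analyzer misses the upper-case document (lower recall), while the truncating analyzer also matches the unrelated "Lucerne" document (lower precision). The lowercasing analyzer happens to get both right here, but that depends entirely on the data and the queries.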

Upvotes: 3
