samsamara

Reputation: 4750

Why doesn't Lucene return all the documents in the index?

I'm using Lucene 5.3 to index a set of documents, and I search them with a BooleanQuery in which each term is boosted by some score.

My problem is that when I search the index, I get fewer hits than there are documents in the index.

    System.out.println("docs in the index = " + reader.numDocs());
    // e.g., docs in the index = 92
    // Asking for numDocs() results ensures no result is omitted from the search.
    TopDocs topDocs = indexSearcher.search(q, reader.numDocs());
    ScoreDoc[] hits = topDocs.scoreDocs;
    System.out.println("results found: " + topDocs.totalHits);
    // e.g., results found: 44

What is the reason for this behaviour? Does Lucene ignore documents with a zero score?

How do I get all the documents in the index, no matter what score they have?

Upvotes: 0

Views: 1509

Answers (1)

femtoRgon

Reputation: 33341

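Lucene only returns documents that actually match the query. A document that matches none of the clauses is not a hit at all, so it never reaches the results. If you want every document in the results, you need to make sure they all match. You can do this by adding a MatchAllDocsQuery as a required clause and keeping your original query as an optional one: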

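    Query query = new BooleanQuery.Builder()
            // MUST: every document matches this clause, so every document is a hit.
            .add(new BooleanClause(new MatchAllDocsQuery(), BooleanClause.Occur.MUST))
            // SHOULD: the original query is optional and only contributes to the score.
            .add(new BooleanClause(myOldQuery, BooleanClause.Occur.SHOULD))
            .build();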
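To see why the hit count changes, here is a toy model in plain Java. It is not Lucene's API and makes no claim about Lucene's real scoring: the class `MatchAllSketch`, its methods, and the sample documents are all made up for illustration. It only mimics the matching rule described above: a query made of optional clauses hits just the documents containing at least one term, while adding a required match-all clause makes every document a hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of the matching rule; NOT Lucene code.
class MatchAllSketch {

    // Optional (SHOULD-like) clauses only: a document is a hit
    // only if it contains at least one of the query terms.
    static List<Integer> shouldOnly(List<List<String>> docs, List<String> terms) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (countMatches(docs.get(id), terms) > 0) {
                hits.add(id);
            }
        }
        return hits;
    }

    // Required match-all clause plus optional terms: every document
    // is a hit, whether or not any optional term matches.
    static List<Integer> withMatchAll(List<List<String>> docs, List<String> terms) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            hits.add(id);
        }
        return hits;
    }

    // Number of query terms that occur in the document.
    static int countMatches(List<String> doc, List<String> terms) {
        int n = 0;
        for (String t : terms) {
            if (doc.contains(t)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("lucene", "index"),
                Arrays.asList("boolean", "query"),
                Arrays.asList("cats"));
        List<String> terms = Arrays.asList("lucene", "query");

        System.out.println("optional clauses only: " + shouldOnly(docs, terms).size() + " hits");
        System.out.println("with match-all clause: " + withMatchAll(docs, terms).size() + " hits");
    }
}
```

In this sketch the third document contains neither term, so it is missing from the first result and present in the second, which is the same effect as the gap between 44 hits and 92 indexed documents in the question.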

Upvotes: 0
