HShbib

Reputation: 1841

Total number of hits in Lucene

I am running a Lucene program that reports the total number of hits for each word I search for, i.e. it lists every file containing that word.

Example:

Searching for 'Amazon'
Number of hits: 2
Hit: Files\peru.txt
Hit: Files\correspondent.txt
Searching for 'business'
Number of hits: 5
Hit: Files\innovation.txt
Hit: Files\xmas.txt
Hit: Files\bp.txt
Hit: Files\symbian.txt
Hit: Files\peru.txt
Searching for 'environment'
Number of hits: 3
Hit: Files\food.txt
Hit: Files\sarkozy.txt
Hit: Files\symbian.txt

My first question is how to add up the hits for the whole query (2 + 5 + 3) and display the total.

My second question is how to display the results in ascending order of hit count: 2, then 3, then 5.

Any suggestions would be appreciated!

Code for searching the index, which produces the output above:

public static void searchIndex(String searchString) throws IOException, ParseException {
        System.out.println("Searching for '" + searchString + "'");

        // Open the index for reading.
        Directory directory = FSDirectory.getDirectory(INDEX_DIRECTORY);
        IndexReader indexReader = IndexReader.open(directory);
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);

        // Parse the search string against the contents field and run the query.
        Analyzer analyzer = new StandardAnalyzer();
        QueryParser queryParser = new QueryParser(FIELD_CONTENTS, analyzer);
        Query query = queryParser.parse(searchString);
        Hits hits = indexSearcher.search(query);
        System.out.println("Number of hits: " + hits.length());

        // Print the stored path of every matching document.
        // path1 is presumably the name of the stored path field, defined elsewhere in the class.
        Iterator<Hit> it = hits.iterator();
        while (it.hasNext()) {
            Hit hit = it.next();
            Document document = hit.getDocument();
            String path = document.get(path1);
            System.out.println("Hit: " + path);
        }
    }
}

Regards.

Upvotes: 0

Views: 2733

Answers (1)

Fred Foo

Reputation: 363818

Use Searcher.search to get the TopDocs for each keyword, then sum or sort by the TopDocs.totalHits member.

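The second parameter to search (the maximum number of top hits to return) shouldn't matter if you just want statistics, since totalHits reports how many documents matched the query, not how many were returned. If you want to retrieve all hits, set it to the number of documents in your index, since that's a trivial upper bound on the number of hits.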
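The summing and sorting step can be sketched on its own, without Lucene on the classpath. In the sketch below, the class HitTotals and its methods total and sortedByHits are hypothetical names, and the counts are hard-coded from the example output in the question; in real code each count would presumably come from the totalHits of the TopDocs returned by search for that keyword.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of Lucene, only illustrates the sum/sort step.
class HitTotals {

    // Add up the per-keyword hit counts, e.g. 2 + 5 + 3.
    static int total(Map<String, Integer> hitsByKeyword) {
        int sum = 0;
        for (int count : hitsByKeyword.values()) {
            sum += count;
        }
        return sum;
    }

    // Return the keyword/count pairs ordered by ascending hit count.
    static List<Map.Entry<String, Integer>> sortedByHits(Map<String, Integer> hitsByKeyword) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(hitsByKeyword.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        return entries;
    }

    public static void main(String[] args) {
        // Assumption: these values stand in for TopDocs.totalHits per keyword.
        Map<String, Integer> hitsByKeyword = new LinkedHashMap<>();
        hitsByKeyword.put("Amazon", 2);
        hitsByKeyword.put("business", 5);
        hitsByKeyword.put("environment", 3);

        for (Map.Entry<String, Integer> entry : sortedByHits(hitsByKeyword)) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
        System.out.println("Total hits: " + total(hitsByKeyword));
    }
}
```

Collecting the counts into a map first keeps the search loop separate from the reporting, so the same map can be both summed and sorted after all keywords have been searched.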

Upvotes: 1
