Lucene - Effective text search

Question

I have an index generated by the PDFBox API class LucenePDFDocument. Since the index contains only the text contents, I want to search it effectively.

I will search the 'contents' field with the search string, and the results must be ordered from most relevant to least relevant. The code given below did display the files that contain the words of the search text, e.g. 'What is your nationality', but the results did not include a file containing this full sentence.

Which query parser and query should I use to search in the scenario described above?

    // Parse the search string against every field listed in 'fields'.
    Query query = new MultiFieldQueryParser(Version.LUCENE_30, fields,
            new StandardAnalyzer(Version.LUCENE_30))
            .parse(searchString);

    // Collect the 5 highest-scoring hits, ordered by relevance score.
    TopScoreDocCollector collector = TopScoreDocCollector.create(5, false);
    searcher.search(query, collector);
    ScoreDoc[] hits = collector.topDocs().scoreDocs;
    System.out.println("count " + hits.length);
    for (ScoreDoc scoreDoc : hits) {
        int docId = scoreDoc.doc;
        Document d = searcher.doc(docId);
        // Print the stored value of the 'path' field.
        System.out.println(d.get("path"));
    }
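
To make the "words vs. full sentence" distinction concrete, here is a rough sketch of one possible direction, not a confirmed fix. It relies on the classic Lucene query syntax, where text in double quotes is parsed as a phrase and `^n` boosts a clause. The helper builds a query string that asks for the exact phrase with a boost and also for the individual words, so that a document containing the whole sentence would be expected to score higher. The class name `PhraseFirstQuery`, the method `build`, and the boost value are hypothetical names chosen for illustration; the code uses only the Java standard library and does not call Lucene itself.

```java
import java.util.Locale;

public class PhraseFirstQuery {

    // Escape backslashes and double quotes so the text can sit inside a quoted phrase.
    static String escapeForPhrase(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Build: "<exact phrase>"^boost followed by the lowercased individual words.
    static String build(String userInput, int phraseBoost) {
        String trimmed = userInput.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("search string must not be empty");
        }
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(escapeForPhrase(trimmed)).append('"')
          .append('^').append(phraseBoost);
        // Lowercasing keeps words such as AND / OR from being read as operators;
        // splitting on non-alphanumerics drops syntax characters such as ? or *.
        for (String word : trimmed.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!word.isEmpty()) {
                sb.append(' ').append(word);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("What is your nationality", 4));
    }
}
```

The resulting string could then be passed to `.parse(...)` in place of `searchString` in the code above. Whether the phrase clause matches also depends on how the analyzer tokenized the text at index time, so this would need to be checked against the actual index.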

