MRM

Reputation: 571

Lucene custom scoring

With a document already indexed, at search time I need to split it into two parts: the first part consists of the first 100 words (tokens), and the rest of the document is the second part. The two parts must be weighted differently when scoring: the second part counts for 70% and the first for 30%.

EDIT 2: I tried writing a searcher that uses SpanPositionRangeQuery, but I must have misunderstood how SpanQuery is used, because I can't get any hits (I used lukeall to verify that the words I was searching for were indexed). Can someone give me a hand?

public static void search(String indexDir, String q) throws Exception
{

    Directory dir = FSDirectory.open(new File(indexDir), null);
    IndexSearcher is = new IndexSearcher(dir);

    Term term = new Term("Field", q);
    SpanPositionRangeQuery spanQuery = new SpanPositionRangeQuery(new SpanTermQuery(term), 0, 100);
    spanQuery.setBoost(0.3f);

    long start = System.currentTimeMillis();
    TopDocs hits = is.search(spanQuery, 10);
    //TopDocs hits = is.search(query, 10);
    long end = System.currentTimeMillis();

    System.err.println("I found " + hits.totalHits + " documents (in " +
            (end - start) + " milliseconds) '" +
            q + "':");

    for (int i=0;i<hits.scoreDocs.length;i++)
    {
        ScoreDoc scoreDoc = hits.scoreDocs[i];
        Document doc = is.doc(scoreDoc.doc);
        System.out.println(doc.get("filename"));
    }

    is.close();
}

I don't know how to combine the query parser with SpanPositionRangeQuery to get what I need...

Upvotes: 0

Views: 390

Answers (1)

A. Coady

Reputation: 57308

Yes, this can be done by setting the boost for each clause in a BooleanQuery. Using separate fields will work, but isn't strictly necessary. Lucene has a SpanPositionRangeQuery suitable for searching part of a document.

<SpanPositionRangeQuery: spanPosRange(field:term, 0, 100)^0.3>
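In Lucene terms that would mean two SHOULD clauses in the BooleanQuery: one SpanPositionRangeQuery over positions 0 to 100 boosted by 0.3, and one from 100 to Integer.MAX_VALUE boosted by 0.7. To make the intended split and weighting concrete, here is a rough plain-Java sketch that does not use Lucene at all. It only counts term occurrences per position range and combines the counts with the 30%/70% weights from the question. The class name, method names and constants are made up for illustration, and Lucene's real scoring (tf-idf, norms and so on) is more involved than this linear sum.

```java
import java.util.Arrays;
import java.util.List;

public class PositionWeightedScore {
    // Assumed values taken from the question: the first 100 tokens weigh 30%, the rest 70%.
    static final int SPLIT = 100;
    static final double FIRST_WEIGHT = 0.3;
    static final double REST_WEIGHT = 0.7;

    /** Counts occurrences of term at token positions in the half-open range [start, end). */
    static int countInRange(List<String> tokens, String term, int start, int end) {
        int count = 0;
        int limit = Math.min(end, tokens.size());
        for (int pos = Math.max(start, 0); pos < limit; pos++) {
            if (tokens.get(pos).equals(term)) {
                count++;
            }
        }
        return count;
    }

    /** Weighted sum of the two per-range counts; a simplification, not Lucene's formula. */
    static double score(List<String> tokens, String term) {
        int first = countInRange(tokens, term, 0, SPLIT);
        int rest = countInRange(tokens, term, SPLIT, Integer.MAX_VALUE);
        return FIRST_WEIGHT * first + REST_WEIGHT * rest;
    }

    public static void main(String[] args) {
        // A 150-token dummy document with one hit in the first part and two in the rest.
        String[] tokens = new String[150];
        Arrays.fill(tokens, "x");
        tokens[5] = "lucene";
        tokens[120] = "lucene";
        tokens[130] = "lucene";
        System.out.println(score(Arrays.asList(tokens), "lucene"));
    }
}
```

Note that the ranges are half-open here, so position 99 belongs to the first part and position 100 to the second. Starting the second range at 100 keeps the two parts from overlapping.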

Upvotes: 1
