elkon

Reputation: 51

Lucene Solr: Is it possible to index with term weights?

I would like to use Solr to index documents with term weights.

Doc1: this(w=0.3) is(w=0.4) the(w=0.1) first(w=0.7) doc(w=0.2)

Doc2: this(w=0.1) is(w=0.2) the(w=0.5) second(w=0.8) doc(w=0.1)

Note that the weight for the same term can be different for two documents.

After indexing I would like the search function to consider these weights when scoring the documents. For example, if the query is "doc", I would like Doc1 to get a higher score.

Is this possible?

Thanks!

Upvotes: 0

Views: 559

Answers (1)

elkon

Reputation: 51

This was pointed out by MatsLindh, thanks!

It can be done using Payloads: https://lucene.apache.org/solr/guide/8_5/other-parsers.html#payload-score-parser

I don't recommend trying to use the example here: https://lucidworks.com/post/end-to-end-payload-example-in-solr/

Here's the solution.

1) Create a new collection:

bin/solr create -c my_docs -s 1 -rf 2

2) Write the following (based on the example) into a CSV file (1.csv):

id,txt_dpf

1,this|0.3 is|0.4 the|0.1 first|0.7 doc|0.2

2,this|0.1 is|0.2 the|0.5 second|0.8 doc|0.1
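If the weights come out of a program rather than being typed by hand, the file can be generated. Below is a minimal Python sketch that writes the same `term|weight` format. The helper names (`to_payload_field`, `build_csv`) and the in-memory `docs` dictionary are made up for illustration. Only the `id,txt_dpf` header and the `term|weight` layout come from the steps above.

```python
import csv
import io

def to_payload_field(weights):
    """Join (term, weight) pairs as 'term|weight' tokens separated by spaces."""
    return " ".join(f"{term}|{weight}" for term, weight in weights)

def build_csv(docs):
    """Return CSV text with an 'id,txt_dpf' header and one row per document."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "txt_dpf"])
    for doc_id, weights in docs.items():
        writer.writerow([doc_id, to_payload_field(weights)])
    return buf.getvalue()

# Hypothetical in-memory representation of the two documents from the question.
docs = {
    1: [("this", 0.3), ("is", 0.4), ("the", 0.1), ("first", 0.7), ("doc", 0.2)],
    2: [("this", 0.1), ("is", 0.2), ("the", 0.5), ("second", 0.8), ("doc", 0.1)],
}

csv_text = build_csv(docs)
print(csv_text)
```

Writing `csv_text` to 1.csv should give a file equivalent to the one shown above.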

3) Add the content into the collection:

bin/post -c my_docs -type text/csv -out yes docs/csv/1.csv

4) Query: localhost:8983/solr/my_docs/select?debug=results&fl=txt_dpf,score&q={!payload_score%20f=txt_dpf%20v=this%20func=max%20includeSpanScore=true}
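The braces and spaces in the `{!payload_score ...}` local params are easy to get wrong when typing the URL by hand. Here is a small Python sketch that builds the same kind of URL with everything percent-encoded. The function name `payload_query_url` and its arguments are hypothetical. It searches for "doc" (the term from the question) instead of "this", and it assumes Solr is running on localhost:8983 as in the step above.

```python
from urllib.parse import urlencode, quote

def payload_query_url(base, collection, field, term,
                      func="max", include_span_score=True):
    """Build a /select URL that uses the payload_score query parser."""
    span = "true" if include_span_score else "false"
    # Local-params syntax, the same shape as the query in step 4.
    q = f"{{!payload_score f={field} v={term} func={func} includeSpanScore={span}}}"
    params = {"debug": "results", "fl": f"{field},score", "q": q}
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return f"{base}/solr/{collection}/select?" + urlencode(params, quote_via=quote)

url = payload_query_url("http://localhost:8983", "my_docs", "txt_dpf", "doc")
print(url)
```

The printed URL can be pasted into a browser or passed to any HTTP client.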

Some important notes:

  1. The name of the field that holds the weights is IMPORTANT! It has to end with "_dpf", so that it matches the *_dpf dynamic field (float payloads) in Solr's default configset.

  2. Use includeSpanScore=true, otherwise your score will just be the weight. With it set, the weight is multiplied by the score of the underlying term query.

@MatsLindh, thanks again!

Upvotes: 1
