Mat

Reputation: 596

Count total frequency of a word in a SOLR index

If I search for a word in a SOLR index, I get a count of the documents that contain it, but if the word occurs several times in a document, that document still counts only once.

I need each returned document to be counted as many times as the searched word occurs in the field.

I read "Word frequency in Solr" and "SOLR term frequency" and enabled the Term Vector Component, but it does not work.

I configured my field in this way:

<field name="text_text" type="textgen" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" />

But if I make the following query:

http://localhost:8888/solr/sources/select?q=text_text%3A%22Peter+Pan%22&fl=text_text&wt=json&indent=true&tv.tf

I don't have any count:

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"text_text",
      "tv.tf":"",
      "indent":"true",
      "q":"text_text:\"Peter Pan\"",
      "wt":"json"}},
  "response":{"numFound":12,"start":0,"docs":[
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"}]
  }}

I see a "numFound" value of 12, but the phrase "Peter Pan" occurs 20 times in total across the 12 documents.

Could you help me find where I'm going wrong, please?

Thank you very much!

Upvotes: 6

Views: 5151

Answers (2)

arun abraham

Reputation: 147

Try requesting the term frequency in the field list (fl) with the termfreq function, like this:

http://localhost:8983/solr/core/select?indent=on&q=solr&fl=field,termfreq("field","term")&wt=json
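As a rough illustration of how the per-document values could be added up into one total, here is a minimal Python sketch. It assumes the function result is given the alias `tf` in `fl`, that the documents have an `id` field, and that `rows` is large enough to return every match; the helper names and the sample response are hypothetical, not taken from the question. It only works for a single term, not a phrase.

```python
from urllib.parse import urlencode


def build_termfreq_url(base_url, field, term, rows=100):
    """Build a select URL that asks Solr for termfreq(field, term) per document.

    The alias "tf" and the "id" field are assumptions for this sketch.
    """
    params = {
        "q": f"{field}:{term}",
        "fl": f"id,tf:termfreq({field},'{term}')",
        "rows": rows,  # must cover numFound, otherwise the sum is partial
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


def total_term_frequency(response, alias="tf"):
    """Sum the aliased termfreq value over all documents in a parsed response."""
    docs = response.get("response", {}).get("docs", [])
    return sum(int(doc.get(alias, 0)) for doc in docs)


# Hypothetical parsed JSON response, shaped like Solr's wt=json output.
sample_response = {
    "response": {
        "numFound": 3,
        "start": 0,
        "docs": [
            {"id": "a", "tf": 2},
            {"id": "b", "tf": 1},
            {"id": "c", "tf": 4},
        ],
    }
}

url = build_termfreq_url("http://localhost:8983/solr/core/select", "text_text", "peter")
total = total_term_frequency(sample_response)
```

With the sample data above, `total` is the sum of the three `tf` values. In real use the response would come from fetching `url` and parsing the JSON body.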

Upvotes: 0

John Petrone

Reputation: 27507

First off, I think your example won't work because "Peter Pan" is not a word or term; it's a phrase. A good discussion of the challenge of finding phrase frequency is here:

termfreq for a phrase

I would retry your example with a single word rather than a phrase and see if it works for you.
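If a phrase count is still needed, one possible fallback is to count the occurrences client-side in the stored field values returned by the query. The sketch below is not a Solr feature and is not what the linked discussion proposes; it is a plain regular-expression count that ignores Solr's analysis chain (stemming, stop words, and so on), so its numbers may differ from what the index considers a match. The function names and the sample documents are hypothetical.

```python
import re


def count_phrase(text, phrase):
    """Count case-insensitive occurrences of a phrase, allowing any whitespace between words."""
    words = [re.escape(word) for word in phrase.split()]
    pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


def total_phrase_count(docs, field, phrase):
    """Sum the phrase occurrences over the given field of every document."""
    return sum(count_phrase(doc.get(field, ""), phrase) for doc in docs)


# Hypothetical documents, shaped like the "docs" array of a Solr JSON response.
docs = [
    {"text_text": "Peter Pan met Wendy. Peter  Pan flew away."},
    {"text_text": "No match here, only Peter."},
    {"text_text": "peter pan again"},
]

phrase_total = total_phrase_count(docs, "text_text", "Peter Pan")
```

This requires the field to be stored and all matching documents to be fetched, so it is only practical for small result sets.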

Upvotes: 0
