Elasticsearch re-indexing same document causing score changes

Question

We created an index containing a single document:

POST sample-index-test/_doc/1
{
    "first_name": "James",
    "last_name" : "Osaka"
}

There is only one document in the index. We then ran the _explain API with a match query against it:

GET sample-index-test/_explain/1
{
  "query": {
    "match": {
      "first_name": "James"
    }
  }
}

The _explain API returns the following details:

score : 0.2876821
number of documents containing term : 1
total number of documents with field : 1

{
  "_index" : "sample-index-test",
  "_type" : "_doc",
  "_id" : "1",
  "matched" : true,
  "explanation" : {
    "value" : 0.2876821,
    "description" : "weight(first_name:james in 0) [PerFieldSimilarity], result of:",
    "details" : [
      {
        "value" : 0.2876821,
        "description" : "score(freq=1.0), computed as boost * idf * tf from:",
        "details" : [
          {
            "value" : 2.2,
            "description" : "boost",
            "details" : [ ]
          },
          {
            "value" : 0.2876821,
            "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
            "details" : [
              {
                "value" : 1,
                "description" : "n, number of documents containing term",
                "details" : [ ]
              },
              {
                "value" : 1,
                "description" : "N, total number of documents with field",
                "details" : [ ]
              }
            ]
          },
          {
            "value" : 0.45454544,
            "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
            "details" : [
              {
                "value" : 1.0,
                "description" : "freq, occurrences of term within document",
                "details" : [ ]
              },
              {
                "value" : 1.2,
                "description" : "k1, term saturation parameter",
                "details" : [ ]
              },
              {
                "value" : 0.75,
                "description" : "b, length normalization parameter",
                "details" : [ ]
              },
              {
                "value" : 1.0,
                "description" : "dl, length of field",
                "details" : [ ]
              },
              {
                "value" : 1.0,
                "description" : "avgdl, average length of field",
                "details" : [ ]
              }
            ]
          }
        ]
      }
    ]
  }
}
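
For reference, the score above can be reproduced by hand from the formulas quoted in the "description" fields. The following is a minimal Python sketch; the function names are our own and are not part of any Elasticsearch API, and Python's 64-bit floats may differ from the reported values in the last digits.

```python
import math


def bm25_idf(n: int, N: int) -> float:
    """idf as quoted in the explain output: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (N - n + 0.5) / (n + 0.5))


def bm25_tf(freq: float, k1: float, b: float, dl: float, avgdl: float) -> float:
    """tf as quoted in the explain output: freq / (freq + k1 * (1 - b + b * dl / avgdl))."""
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


def bm25_score(boost, n, N, freq, k1, b, dl, avgdl) -> float:
    """score = boost * idf * tf, as stated in the explain output."""
    return boost * bm25_idf(n, N) * bm25_tf(freq, k1, b, dl, avgdl)


# Values copied from the first explain response (n = 1, N = 1).
first_score = bm25_score(boost=2.2, n=1, N=1, freq=1.0,
                         k1=1.2, b=0.75, dl=1.0, avgdl=1.0)
print(f"idf   = {bm25_idf(1, 1):.7f}")
print(f"tf    = {bm25_tf(1.0, 1.2, 0.75, 1.0, 1.0):.7f}")
print(f"score = {first_score:.7f}")
```

With n = 1 and N = 1 this agrees with the reported idf of 0.2876821 and tf of 0.45454544 to within rounding.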

Next, we ran the following index request for the same document ID several times within a few seconds:

POST sample-index-test/_doc/1
{
    "first_name": "James",
    "last_name" : "Cena"
}

Running the same _explain request again now returns a different score, along with higher values for "number of documents containing term" and "total number of documents with field":

score : 0.046520013
number of documents containing term : 10
total number of documents with field : 10

{
  "_index" : "sample-index-test",
  "_type" : "_doc",
  "_id" : "1",
  "matched" : true,
  "explanation" : {
    "value" : 0.046520013,
    "description" : "weight(first_name:james in 0) [PerFieldSimilarity], result of:",
    "details" : [
      {
        "value" : 0.046520013,
        "description" : "score(freq=1.0), computed as boost * idf * tf from:",
        "details" : [
          {
            "value" : 2.2,
            "description" : "boost",
            "details" : [ ]
          },
          {
            "value" : 0.046520017,
            "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
            "details" : [
              {
                "value" : 10,
                "description" : "n, number of documents containing term",
                "details" : [ ]
              },
              {
                "value" : 10,
                "description" : "N, total number of documents with field",
                "details" : [ ]
              }
            ]
          },
          {
            "value" : 0.45454544,
            "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
            "details" : [
              {
                "value" : 1.0,
                "description" : "freq, occurrences of term within document",
                "details" : [ ]
              },
              {
                "value" : 1.2,
                "description" : "k1, term saturation parameter",
                "details" : [ ]
              },
              {
                "value" : 0.75,
                "description" : "b, length normalization parameter",
                "details" : [ ]
              },
              {
                "value" : 1.0,
                "description" : "dl, length of field",
                "details" : [ ]
              },
              {
                "value" : 1.0,
                "description" : "avgdl, average length of field",
                "details" : [ ]
              }
            ]
          }
        ]
      }
    ]
  }
}
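
To make the effect of the changed counts concrete, here is a small Python sketch that evaluates the idf formula from the explain output for n = N = 1 through 10. The helper name `idf` is hypothetical and only mirrors the quoted formula; it is not an Elasticsearch function.

```python
import math


def idf(n: int, N: int) -> float:
    """idf as quoted in the explain output: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (N - n + 0.5) / (n + 0.5))


# Every counted document contains the term, so n == N at each step.
idf_by_count = {count: idf(count, count) for count in range(1, 11)}

for count, value in idf_by_count.items():
    print(f"n = N = {count:2d}  idf = {value:.7f}")
```

In both explain responses boost * tf is 2.2 * 0.45454544, which is approximately 1.0, so the final score is effectively the idf itself. The drop from 0.2876821 to 0.046520013 therefore comes entirely from n and N rising from 1 to 10; tf, dl, and avgdl are unchanged.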

Why does Elasticsearch increase the counts for "total number of documents with field" and "number of documents containing term" when the index still contains only a single document?
