adarsh2109

Reputation: 109

Searching for a unique id returns multiple documents

I have 5 million documents, each with a unique customerid as its mapping id. Searching for a single customerid returns 1992 documents. This happens for every id, with a different count each time, although each search is supposed to return exactly one document.

I executed the query below in Kibana:

GET /my_index/_search
{ 
  "query": {
     "match": {
      "customerid": "e32e6b34-5e3f-4bb9-a3af-e89714b418ca"
      }
  }
}

It gives me the result below for a unique customer id:

{
  "took" : 20,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1992,
      "relation" : "eq"
    },
    "max_score" : 59.505646,
    "hits" : [
 ....
 ....
 ....

Below is the mapping of my index:

{
 "pb_2409" : {
 "mappings" : {
  "dynamic_date_formats" : [
    "yyyy-MM-dd||yyyy-MM-dd HH:mm:ss.S||yyyy-MM-dd HH:mm:ss"
  ],
  "dynamic_templates" : [
    {
      "objects" : {
        "match_mapping_type" : "object",
        "mapping" : {
          "type" : "nested"
        }
      }
    }
  ],
  "properties" : {
    "customerid" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }

Am I missing something?

Upvotes: 0

Views: 41

Answers (1)

Assael Azran

Reputation: 2993

The match query runs against the analyzed text field, so search customerid as a keyword instead, and add a normalizer to your index settings.

 "settings": {
    "analysis": {
      "normalizer": {
        "my_custom_normalizer": {
          "type": "custom",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  }

Then add "normalizer": "my_custom_normalizer" to the keyword sub-field of customerid (in case you want to search your id case-insensitively):

"properties" : {
    "customerid" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256,
          "normalizer": "my_custom_normalizer"
        }
      }
    }
}

Your search query will then look like this:

    GET /my_index/_search
    { 
      "query": {
         "term": {
          "customerid.keyword": {
             "value":"e32e6b34-5e3f-4bb9-a3af-e89714b418ca"
          }
         }
      }
    }
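
To see why the original match query matched so many documents, it may help to inspect how the text field analyzes the id. This is a minimal sketch using the _analyze API, assuming the index is reachable as my_index as in the queries above and that customerid uses the default standard analyzer (the mapping shown does not set one):

```json
GET /my_index/_analyze
{
  "field": "customerid",
  "text": "e32e6b34-5e3f-4bb9-a3af-e89714b418ca"
}
```

If the response lists several tokens, one per hyphen-separated segment, then the match query (which ORs its terms by default) matches any document that shares a single segment with the id. The term query on customerid.keyword compares against the whole unanalyzed value instead. Since the question's mapping already contains the customerid.keyword sub-field, the term query should work without the normalizer; the normalizer only matters for case-insensitive lookups.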

Your new index settings and mappings (settings and mappings go at the top level of the request body, not under the index name):

PUT /index
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_custom_normalizer": {
          "type": "custom",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "dynamic_date_formats": [
      "yyyy-MM-dd||yyyy-MM-dd HH:mm:ss.S||yyyy-MM-dd HH:mm:ss"
    ],
    "dynamic_templates": [
      {
        "objects": {
          "match_mapping_type": "object",
          "mapping": {
            "type": "nested"
          }
        }
      }
    ],
    "properties": {
      "customerid": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256,
            "normalizer": "my_custom_normalizer"
          }
        }
      }
    }
  }
}
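
A normalizer generally cannot be added to an existing keyword field in place, so the existing documents would need to be copied into the newly created index. This is a minimal sketch using the _reindex API, assuming the old index is pb_2409 (the name shown in the question's mapping) and the new one is the index placeholder used in the PUT request above:

```json
POST /_reindex
{
  "source": { "index": "pb_2409" },
  "dest": { "index": "index" }
}
```

With 5 million documents this may take a while, so it is worth trying it on a small test index first.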

See also: https://www.elastic.co/blog/strings-are-dead-long-live-strings

Hope that helps.

Upvotes: 1
