ItayB

Reputation: 11337

Elasticsearch: supporting both case-sensitive and case-insensitive queries

Setup: Elasticsearch 6.3

I have an index that represents the products catalog. 

Every document contains one product's data.

One of the fields is called categories; it is an array of strings listing the relevant categories.

99.9% of the queries are: give me the products that match categories A, B and C. This query is case insensitive, so the categories mapping looks like:

"categories": {
    "type": "keyword",
    "normalizer": "lowercase_normalizer"
}
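
The question does not show how lowercase_normalizer is defined. A minimal sketch of the index settings it would typically rely on, assuming a custom normalizer that applies only the built-in lowercase token filter:

```json
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  }
}
```

With a definition like this, every value is lowercased before it is indexed, which is why a terms aggregation on this field can only return lowercase keys.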

For reporting (0.1% of all queries) I need to return a list of all possible categories in their original case (case sensitive)!

Consider the following documents:

"_id": "product1",
"_source": {
    "categories": [
        "WOMEN",
        "Footwear"
     ]
}

"_id": "product2",
"_source": {
    "categories": [
        "Men",
        "Footwear"
     ]
}

Running the following query:

{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "categories",
        "size": 100
      }
    }
  }
}

returns:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 40453,
    "max_score": 0,
    "hits": [

    ]
  },
  "aggregations": { 
    "sterms#categories": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 12453,
      "buckets": [
        {
          "key": "men",
          "doc_count": 27049
        },
        {
          "key": "women",
          "doc_count": 21332
        },
       .........
      ]
    }
  }
}

Is there a way to return the categories in their original case (as stored in the documents)? I'm interested in ["WOMEN", "Men"] in this query's result.

The question in Elasticsearch discuss forum

Thanks, Itay

Upvotes: 0

Views: 867

Answers (1)

Pierre Mallet

Reputation: 7221

You need to add a sub-field (a multi-field) to your property that does not use any normalizer:

Documentation on fields

Something like

"categories": {
    "type": "keyword",
    "normalizer": "lowercase_normalizer",
    "fields": {
        "case_sensitive": {
            "type": "keyword"
        }
    }
}

Then run your aggregation on this sub-field:

{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "categories.case_sensitive",
        "size": 100
      }
    }
  }
}
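
The two needs can also be combined in one request. The sketch below filters on the normalized categories field and aggregates on the categories.case_sensitive sub-field; the filter value "footwear" is only an illustration taken from the example documents and is written in lowercase to match the indexed terms.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "categories": "footwear" } }
      ]
    }
  },
  "aggs": {
    "categories": {
      "terms": {
        "field": "categories.case_sensitive",
        "size": 100
      }
    }
  }
}
```

Two caveats, as far as I know: documents indexed before the sub-field was added have no values for it until they are reindexed (for example with an update by query), and the case-sensitive field puts "Footwear" and "footwear" in separate buckets.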

Upvotes: 1
