Reputation: 11337
Setup: Elasticsearch 6.3
I have an index that represents the products catalog.
Every document contains one product's data.
One of the fields is called categories,
which is an array of strings: the list of relevant categories.
99.9% of the queries are: give me the products that match categories A, B and C. The query above is case-insensitive, so the categories mapping looks like:
"categories": {
  "type": "keyword",
  "normalizer": "lowercase_normalizer"
}
For reporting (0.1% of all queries), I need to return a list of all possible categories with their original case preserved.
Consider the following documents:
"_id": "product1",
"_source": {
  "categories": [
    "WOMEN",
    "Footwear"
  ]
}
"_id": "product2",
"_source": {
  "categories": [
    "Men",
    "Footwear"
  ]
}
Running the following query:
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "categories",
        "size": 100
      }
    }
  }
}
returns:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 40453,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "sterms#categories": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 12453,
      "buckets": [
        {
          "key": "men",
          "doc_count": 27049
        },
        {
          "key": "women",
          "doc_count": 21332
        },
        .........
      ]
    }
  }
}
Is there a way to return the categories in their original case (as stored in the documents)? I'm interested in getting ["WOMEN", "Men"]
in this query's result.
The question in Elasticsearch discuss forum
Thanks, Itay
Upvotes: 0
Views: 867
Reputation: 7221
You need to add a sub-field to your property that does not use any normalizer.
Something like:
"categories": {
  "type": "keyword",
  "normalizer": "lowercase_normalizer",
  "fields": {
    "case_sensitive": {
      "type": "keyword"
    }
  }
}
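For context, here is a rough sketch of what a complete index-creation body might look like with this mapping, written as a Python dict so it can be printed or passed to a client. The definition of lowercase_normalizer is not shown in the question, so the custom normalizer with a lowercase filter below is an assumption, as is the mapping type name _doc (Elasticsearch 6.x still expects a type name in the mapping).

```python
import json

# Hypothetical index-creation body for Elasticsearch 6.x.
# Assumptions: the normalizer definition and the "_doc" type name
# are not taken from the question; adapt them to the real index.
index_body = {
    "settings": {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {
                    "type": "custom",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "_doc": {
            "properties": {
                "categories": {
                    # Main field: normalized, used for case-insensitive matching.
                    "type": "keyword",
                    "normalizer": "lowercase_normalizer",
                    "fields": {
                        # Sub-field: no normalizer, keeps the original case.
                        "case_sensitive": {"type": "keyword"}
                    },
                }
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

Existing queries on categories keep working unchanged, since only a sub-field is added.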
Then run your aggregation on this sub-field:
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": {
        "field": "categories.case_sensitive",
        "size": 100
      }
    }
  }
}
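To illustrate why the two fields give different buckets, here is a small plain-Python simulation (no Elasticsearch involved). It assumes that a terms aggregation counts documents per indexed term and that the normalizer is applied before the term is indexed; the helper terms_agg is hypothetical and only mimics that behaviour on the two sample documents from the question.

```python
from collections import Counter

def terms_agg(docs, normalize=None):
    """Count documents per distinct indexed term (a rough stand-in for doc_count)."""
    counts = Counter()
    for doc in docs:
        # A set, so that each document is counted at most once per term.
        indexed = {normalize(v) if normalize else v for v in doc["categories"]}
        counts.update(indexed)
    return dict(counts)

docs = [
    {"categories": ["WOMEN", "Footwear"]},
    {"categories": ["Men", "Footwear"]},
]

# "categories": terms are lowercased at index time.
normalized = terms_agg(docs, normalize=str.lower)
# "categories.case_sensitive": terms are indexed as-is.
raw = terms_agg(docs)

print(normalized)
print(raw)
```

Two things to keep in mind: on the sub-field, variants such as "Footwear" and "footwear" end up in separate buckets, and documents indexed before the mapping change will not show up in categories.case_sensitive until they are reindexed.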
Upvotes: 1