Reputation: 1789
I'm new to Elasticsearch and need help solving the following:
I have a set of documents that each contain multiple products. I want to filter on the product property product_brand by "Apple" and get the number of products matching the filter. The result, however, should be grouped by the document ID, which is also part of the document itself (test_id).
Example document:
{
  "test" : {
    "test_id" : 19988,
    "test_name" : "Test"
  },
  "products" : [
    {
      "product_id" : 1,
      "product_brand" : "Apple"
    },
    {
      "product_id" : 2,
      "product_brand" : "Apple"
    },
    {
      "product_id" : 3,
      "product_brand" : "Samsung"
    }
  ]
}
The result should be:
{
  "key" : 19988,
  "count" : 2
},
In SQL it would look approximately like this:
SELECT test_id, COUNT(product_id)
FROM `test`
WHERE product_brand = 'Apple'
GROUP BY test_id;
How can I achieve this?
Upvotes: 0
Views: 927
Reputation: 1228
I think this should get you pretty close:
GET /test/_search
{
  "_source": {
    "includes": [
      "test.test_id",
      "_score"
    ]
  },
  "query": {
    "function_score": {
      "query": {
        "match": {
          "products.product_brand.keyword": "Apple"
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "def matches=0; def products = params['_source']['products']; for(p in products){if(p.product_brand == params['brand']){matches++;}} return matches;",
              "params": {
                "brand": "Apple"
              }
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}
This approach uses a function_score query, but you could also move the same script into a script field (script_fields) if you wanted to score differently. The query above only matches documents that have at least one product object whose brand is exactly "Apple".
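To make the script's counting logic easier to follow, here is a rough Python equivalent of what it does for a single document. This is only an illustration: the function name count_brand_matches is hypothetical, and Elasticsearch does not run this code; it runs the Painless script from the query.

```python
def count_brand_matches(source, brand):
    """Count the products in one document's _source whose product_brand equals brand."""
    matches = 0
    for product in source.get("products", []):
        if product.get("product_brand") == brand:
            matches += 1
    return matches


# The example document from the question.
doc = {
    "test": {"test_id": 19988, "test_name": "Test"},
    "products": [
        {"product_id": 1, "product_brand": "Apple"},
        {"product_id": 2, "product_brand": "Apple"},
        {"product_id": 3, "product_brand": "Samsung"},
    ],
}

print(count_brand_matches(doc, "Apple"))  # 2
```

The comparison is an exact string match, so "apple" or "Apple Inc." would not be counted.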
You just need to keep the two occurrences of "Apple" in sync: the one in the match query and the one in the script's brand parameter. Alternatively, you could match on everything in the function_score query and pay attention only to the score. Your output could look like this:
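One way to keep the two brand values from drifting apart is to build the request body from a single variable. The sketch below is a minimal example under some assumptions: the helper name build_brand_count_query is hypothetical, and the "boost_mode": "replace" setting is my addition so that the final score is the script's count rather than the count multiplied by the match query's relevance score (function_score multiplies by default).

```python
import json

# The same Painless script as in the query, split across lines for readability.
SCRIPT = (
    "def matches=0; def products = params['_source']['products']; "
    "for(p in products){if(p.product_brand == params['brand']){matches++;}} "
    "return matches;"
)


def build_brand_count_query(brand):
    """Build the search body so the filter and the script share one brand value."""
    return {
        "_source": {"includes": ["test.test_id", "_score"]},
        "query": {
            "function_score": {
                "query": {"match": {"products.product_brand.keyword": brand}},
                "functions": [
                    {
                        "script_score": {
                            "script": {"source": SCRIPT, "params": {"brand": brand}}
                        }
                    }
                ],
                # Assumption: use only the script's value as the final score.
                "boost_mode": "replace",
            }
        },
    }


body = build_brand_count_query("Apple")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET /test/_search, or passed to whichever client you use.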
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 2,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "AV99vrBpgkgblFY6zscA",
        "_score": 2,
        "_source": {
          "test": {
            "test_id": 19988
          }
        }
      }
    ]
  }
}
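If you want the key/count shape from the question, you could reshape the hits on the client side. The following is a small sketch, assuming each hit's _score holds exactly the number of matching products (which is only true when the script's value is not combined with the query's relevance score) and that each document has its own test_id. The function name hits_to_counts is hypothetical.

```python
def hits_to_counts(response):
    """Turn search hits into a list of {"key": test_id, "count": matches} rows."""
    rows = []
    for hit in response["hits"]["hits"]:
        rows.append(
            {
                "key": hit["_source"]["test"]["test_id"],
                # Assumption: the score is the match count computed by the script.
                "count": int(hit["_score"]),
            }
        )
    return rows


# A trimmed copy of the response shown above.
response = {
    "hits": {
        "total": 1,
        "max_score": 2,
        "hits": [
            {
                "_index": "test",
                "_id": "AV99vrBpgkgblFY6zscA",
                "_score": 2,
                "_source": {"test": {"test_id": 19988}},
            }
        ],
    }
}

print(hits_to_counts(response))  # [{'key': 19988, 'count': 2}]
```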
And the mappings in the index I used looked like this:
{
  "test": {
    "mappings": {
      "doc": {
        "properties": {
          "products": {
            "properties": {
              "product_brand": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "product_id": {
                "type": "long"
              }
            }
          },
          "test": {
            "properties": {
              "test_id": {
                "type": "long"
              },
              "test_name": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
Upvotes: 1