mbouclas

Reputation: 744

Elasticsearch aggregation: grouping values by field

My document structure looks like this:

{
    "title": "A title",
    "ExtraFields": [
        {
            "value": "print",
            "fieldID": "5535627631efa0843554b0ea"
        },
        {
            "value": "POLYE",
            "fieldID": "5535627631efa0843554b0ec"
        },
        {
            "value": "30",
            "fieldID": "5535627631efa0843554b0ed"
        },
        {
            "value": "0",
            "fieldID": "5535627631efa0843554b0ee"
        },
        {
            "value": "0",
            "fieldID": "5535627731efa0843554b0ef"
        },
        {
            "value": "0.42",
            "fieldID": "5535627831efa0843554b0f0"
        },
        {
            "value": "40",
            "fieldID": "5535627831efa0843554b0f1"
        },
        {
            "value": "30",
            "fieldID": "5535627831efa0843554b0f2"
        },
        {
            "value": "18",
            "fieldID": "5535627831efa0843554b0f3"
        },
        {
            "value": "24",
            "fieldID": "5535627831efa0843554b0f4"
        }
    ]
}

The ideal output (best-case scenario) would be:

[
{
    "field" : "5535627831efa0843554b0f4",
    "values" : [
        {
            "label" : "24",
            "count" : 2
        },
        {
            "label" : "18",
            "count" : 5
        }
    ]
},
{
    "field" : "5535627831efa0843554b0f3",
    "values" : [
        {
            "label" : "cott",
            "count" : 20
        },
        {
            "label" : "polye",
            "count" : 12
        }
    ]
}
]

But I could also make do with a simpler one like this (this is how I get it from MongoDB now):

[
{
    "field" : "5535627831efa0843554b0f4",
    "value" : "24",
    "count" : 2
},
{
    "field" : "5535627831efa0843554b0f4",
    "value" : "18",
    "count" : 5
},
{
    "field" : "5535627831efa0843554b0f3",
    "value" : "cott",
    "count" : 20
},
{
    "field" : "5535627831efa0843554b0f3",
    "value" : "polye",
    "count" : 12
}
] 

What would the aggregation query look like? Does this structure need any special mappings?

Upvotes: 1

Views: 426

Answers (1)

Val

Reputation: 217564

To get what you want, you need a nested mapping for the ExtraFields sub-structure, so that each value stays tied to its own fieldID. Your document mapping would look like this (doctype is just a name I chose for your document type; use whatever you have now):

PUT /test/_mapping/doctype
{
  "doctype": {
    "properties": {
      "title": {
        "type": "string"
      },
      "ExtraFields": {
        "type": "nested",
        "properties": {
          "value": {
            "type": "string",
            "index": "not_analyzed"
          },
          "fieldID": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
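
To illustrate why the nested type matters: with a plain object mapping, an array of objects is indexed as one flat multi-valued field per property, so the link between a given fieldID and its value is lost. The Python sketch below only imitates that flattening on an in-memory document. The helper name flatten_object_array is made up for illustration and is not an Elasticsearch API.

```python
def flatten_object_array(doc, path):
    """Imitate a non-nested object array: one flat multi-valued field
    per property, with the per-object pairing thrown away."""
    flat = {}
    for obj in doc[path]:
        for prop, val in obj.items():
            flat.setdefault(f"{path}.{prop}", []).append(val)
    return flat


# Two entries taken from the question's sample document.
doc = {
    "title": "A title",
    "ExtraFields": [
        {"value": "print", "fieldID": "5535627631efa0843554b0ea"},
        {"value": "POLYE", "fieldID": "5535627631efa0843554b0ec"},
    ],
}

flat = flatten_object_array(doc, "ExtraFields")

# Without nested: every fieldID appears to co-occur with every value.
pairs_without_nested = {
    (field_id, value)
    for field_id in flat["ExtraFields.fieldID"]
    for value in flat["ExtraFields.value"]
}

# With nested: each object is kept as its own unit, so pairs stay intact.
pairs_with_nested = {
    (obj["fieldID"], obj["value"]) for obj in doc["ExtraFields"]
}

print(sorted(pairs_without_nested))
print(sorted(pairs_with_nested))
```

In the flattened case the pair ("5535627631efa0843554b0ea", "POLYE") shows up even though no such entry exists in the document, which is the kind of miscount a values-per-field aggregation would suffer from.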

Then you can index your document:

PUT /test/doctype/123
{
    "title" : "A title",
    "ExtraFields": [
       ...
    ]
}

and send the following aggregation query:

POST /test/doctype/_search
{
  "size": 0,
  "aggs": {
    "fields": {
      "nested": {
        "path": "ExtraFields"
      },
      "aggs": {
        "fields": {
          "terms": {
            "field": "ExtraFields.fieldID"
          },
          "aggs": {
            "values": {
              "terms": {
                "field": "ExtraFields.value"
              }
            }
          }
        }
      }
    }
  }
}

This yields the results you described in your best-case scenario. The JSON field names in the response are a bit different (each bucket carries key and doc_count rather than label and count), but the structure is the same.
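
If the exact field names from the question are needed, the response can be reshaped on the client. Here is a minimal Python sketch, assuming the response has already been parsed into a dict and that the aggregation names match the query above ("fields" > "fields" > "values"). The function names reshape and flatten_rows are hypothetical, and sample_response is hand-written using the counts from the question, not real output.

```python
def reshape(response):
    """Convert the nested/terms buckets into the best-case layout:
    [{"field": ..., "values": [{"label": ..., "count": ...}]}]."""
    field_buckets = response["aggregations"]["fields"]["fields"]["buckets"]
    return [
        {
            "field": fb["key"],
            "values": [
                {"label": vb["key"], "count": vb["doc_count"]}
                for vb in fb["values"]["buckets"]
            ],
        }
        for fb in field_buckets
    ]


def flatten_rows(grouped):
    """Convert the grouped layout into the simpler one-row-per-value layout."""
    return [
        {"field": g["field"], "value": v["label"], "count": v["count"]}
        for g in grouped
        for v in g["values"]
    ]


# Hand-written sample (assumption): counts copied from the question.
sample_response = {
    "aggregations": {
        "fields": {
            "doc_count": 39,
            "fields": {
                "buckets": [
                    {
                        "key": "5535627831efa0843554b0f3",
                        "doc_count": 32,
                        "values": {"buckets": [
                            {"key": "cott", "doc_count": 20},
                            {"key": "polye", "doc_count": 12},
                        ]},
                    },
                    {
                        "key": "5535627831efa0843554b0f4",
                        "doc_count": 7,
                        "values": {"buckets": [
                            {"key": "18", "doc_count": 5},
                            {"key": "24", "doc_count": 2},
                        ]},
                    },
                ]
            },
        }
    }
}

grouped = reshape(sample_response)
flat_rows = flatten_rows(grouped)
print(grouped)
print(flat_rows)
```

Note that a terms aggregation returns only the top 10 buckets by default, so with more distinct fieldIDs or values than that, a larger "size" on the terms aggregations would likely be needed.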

Give it a try and let us know.

Upvotes: 1
