Hammerbot

Reputation: 16324

Elasticsearch aggregation by field name

Imagine two documents:

[
    {
        "_id": "abc",
        "categories": {
            "category-id-1": 1,
            "category-id-2": 50
        }
    },
    {
        "_id": "def",
        "categories": {
            "category-id-1": 2
        }
    }
]

As you can see, each document can be associated with any number of categories by adding a sub-field, named after the category ID, to the categories object.

With this mapping, I should be able to request the documents in a given category and sort them by the value stored in that category's field.
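
For illustration, such a request could be built roughly as follows. This is a minimal sketch, assuming an `exists` query on the category's sub-field combined with a sort on the same field; the helper name `category_query` is hypothetical and the body has not been run against a cluster.

```python
import json


def category_query(category_id, order="desc"):
    """Build a search body for documents that have a value for one category.

    Hypothetical helper: it assumes the mapping described here, where each
    category is a numeric sub-field of the `categories` object.
    """
    field = "categories." + category_id
    return {
        # Keep only documents where the category's sub-field is present.
        "query": {"exists": {"field": field}},
        # Order the hits by the value stored in that sub-field.
        "sort": [{field: {"order": order}}],
    }


print(json.dumps(category_query("category-id-1"), indent=2))
```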

My problem is that I now want an aggregation that counts the number of documents in each category. For the dataset above, that would give the following result:

{
    "aggregations": {
        "categories" : {
            "buckets": [
                {
                    "key": "category-id-1",
                    "doc_count": 2
                },
                {
                    "key": "category-id-2",
                    "doc_count": 1
                }
            ]
        }
    }
}

I can't find anything in the documentation that solves this problem. I'm completely new to Elasticsearch, so I may be doing something wrong, either in my documentation search or in my choice of mapping.

Is it possible to do this kind of aggregation with my mapping? I'm using ES 6.x.

EDIT: Here is the mapping for the index:

{
  "test1234": {
    "mappings": {
      "_doc": {
        "properties": {
          "categories": {
            "properties": {
              "category-id-1": {
                "type": "long"
              },
              "category-id-2": {
                "type": "long"
              }
            }
          }
        }
      }
    }
  }
}

Upvotes: 0

Views: 1756

Answers (1)

Pierre Mallet

Reputation: 7221

The most straightforward solution is to add a new field that contains all the distinct categories of a document.

If we call this field categories_list, a solution could look like this:

Change the mapping to:

{
  "test1234": {
    "mappings": {
      "_doc": {
        "properties": {
          "categories": {
            "properties": {
              "category-id-1": {
                "type": "long"
              },
              "category-id-2": {
                "type": "long"
              }
            }
          },
          "categories_list": {
             "type": "keyword"
          }
        }
      }
    }
  }
}

Then modify your documents like this:

[
    {
        "_id": "abc",
        "categories": {
            "category-id-1": 1,
            "category-id-2": 50
        },
        "categories_list": ["category-id-1", "category-id-2"]
    },
    {
        "_id": "def",
        "categories": {
            "category-id-1": 2
        },
        "categories_list": ["category-id-1"]
    }
]
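
Rather than maintaining categories_list by hand, it can be derived from the keys of categories before indexing. A minimal sketch, assuming documents are plain Python dicts; the helper name `with_categories_list` is hypothetical and the indexing call itself is left out:

```python
def with_categories_list(doc):
    """Return a copy of `doc` with `categories_list` derived from `categories`.

    Hypothetical helper: the keys of the `categories` object become the
    values of the keyword field used by the terms aggregation.
    """
    updated = dict(doc)
    # Iterating a dict yields its keys; sorting keeps the output stable.
    updated["categories_list"] = sorted(doc.get("categories", {}))
    return updated


docs = [
    {"_id": "abc", "categories": {"category-id-1": 1, "category-id-2": 50}},
    {"_id": "def", "categories": {"category-id-1": 2}},
]

for doc in docs:
    print(with_categories_list(doc))
```

The original dict is left untouched, so the helper can be applied safely just before each document is sent to the index.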

Your aggregation request should then be:

{
  "aggs": {
    "categories": {
      "terms": {
        "field": "categories_list",
        "size": 10
      }
    }
  }
}

and it will return:

{
  "aggregations": {
    "categories": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "category-id-1",
          "doc_count": 2
        },
        {
          "key": "category-id-2",
          "doc_count": 1
        }
      ]
    }
  }
}
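
On the client side, the buckets can be flattened into a simple mapping from category to document count. A minimal sketch, assuming the response has already been parsed into a Python dict; the helper name `bucket_counts` is hypothetical:

```python
def bucket_counts(response, agg_name="categories"):
    """Map each bucket key of a terms aggregation to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Sample response copied from the result shown in this answer.
response = {
    "aggregations": {
        "categories": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "category-id-1", "doc_count": 2},
                {"key": "category-id-2", "doc_count": 1},
            ],
        }
    }
}

print(bucket_counts(response))
```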

Upvotes: 2
