Eva Err
Eva Err

Reputation: 11

Elasticseach: how to aggregate and calculate custom fields based on aggregation

Group by field

Id:1 Year:1999 Series title : Cool Series Episode title: ep1

Id:2 Year:2000 Series title : Cool Series Episode title: ep2

Id:3 Year:2001 Series title : Cool Series Episode title: ep3

Id:4 Year:2005 Series title : Another series Episode title: ep1

Id:5 Year:2006 Series title : Another series Episode title: ep2

How could I collapse/aggregate by series title. And in output have custom calculated fields: min-max years, count of episodes

Expected result Would be 1 object per series title.

Series title: Cool Series Custom-field-year: 1999-2001 Total-count: 3

Series title: Another series Custom-field-year: 2005-2006 Total-count: 2

Group by field which may be null

Could it be possible to use agg terms to group objects into one by series title and those that have no series title - would be a separate objects.

Id:1 Year:1999 Series title : Cool Series Episode title: ep1

Id:2 Year:2000 Series title : Cool Series Episode title: ep2

Id:3 Year:2001 Series title : Cool Series Episode title: ep3

Id:4 Year:2005 Series title : Another series Episode title: ep1

Id:5 Year:2006 Series title : Another series Episode title: ep2

Id:6 Year:2008 Series title : null Episode title: A Movie 1

Id:7 Year:2009 Series title : null Episode title: The Movie 2

Expected result in case 2 Would be 1 object per series title and seeparate movies objects

Series title: Cool Series Custom-field-year: 1999-2001 Total-count: 3

Series title: Another series Custom-field-year: 2005-2006 Total-count: 2

Series title: null Ep A Movie 1

Series title: null Ep The Movie 2

Upvotes: 0

Views: 618

Answers (2)

rabbitbr
rabbitbr

Reputation: 3261

If I understand correctly you can use the terms aggs and as sub aggs Max and Min. Use Missing parameter defines how documents that are missing a value should be treated. By default they will be ignored but it is also possible to treat them as if they had a value.

PUT idx_test/_bulk
{"index":{"_id":1}}
{"year":1999, "series_title" : "Cool Series Episode", "title": "ep1"}
{"index":{"_id":2}}
{"year":2000, "series_title" : "Cool Series Episode", "title": "ep2"}
{"index":{"_id":3}}
{"year":2001, "series_title" : "Cool Series Episode", "title": "ep4"}
{"index":{"_id":4}}
{"year":2005, "series_title" : "Another series Episode", "title": "ep1"}
{"index":{"_id":5}}
{"year":2006, "series_title" : "Another series Episode", "title": "ep1"}
{"index":{"_id":6}}
{"year":2008, "series_title" : null, "title": "A Movie 1"}
{"index":{"_id":7}}
{"year":2009, "series_title" :null, "title": "A Movie 2"}

GET idx_test/_search
{
  "size": 0,
  "aggs": {
    "group_by_series_title": {
      "terms": {
        "field": "series_title.keyword",
        "missing": "group_series_title_empty", 
        "size": 10
      },
      "aggs": {
        "min_year": {
          "min": {
            "field": "year"
          }
        },
        "max_year": {
          "max": {
            "field": "year"
          }
        }
      }
    }
  }
}

Results

"aggregations": {
    "group_by_series_title": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Cool Series Episode",
          "doc_count": 3,
          "min_year": {
            "value": 1999
          },
          "max_year": {
            "value": 2001
          }
        },
        {
          "key": "Another series Episode",
          "doc_count": 2,
          "min_year": {
            "value": 2005
          },
          "max_year": {
            "value": 2006
          }
        },
        {
          "key": "group_series_title_empty",
          "doc_count": 2,
          "min_year": {
            "value": 2008
          },
          "max_year": {
            "value": 2009
          }
        }
      ]
    }
  }

Upvotes: 0

jaspreet chahal
jaspreet chahal

Reputation: 9099

You can use multiple aggregations to achieve this

  1. Terms aggregation

  2. Min aggregation

  3. Max aggregation

  4. Value count aggregation

{
  "size": 0,
  "aggs": {
    "NAME": {
      "terms": {
        "field": "Series title.keyword",
         "missing": "Missing title",
        "size": 10
      },
      "aggs": {
        "min-year": {
          "min": {
            "field": "Year"
          }
        },
        "max-year": {
          "max": {
            "field": "Year"
          }
        },
        "episode-count":{
           "value_count": {
             "field": "Episode title.keyword"
           }
        }
      }
    }
  }
}

Result

"buckets" : [
        {
          "key" : "Cool Series",
          "doc_count" : 3,
          "episode-count" : {
            "value" : 3
          },
          "max-year" : {
            "value" : 2001
          },
          "min-year" : {
            "value" : 1999
          }
        },
        {
          "key" : "Another series",
          "doc_count" : 2,
          "episode-count" : {
            "value" : 2
          },
          "max-year" : {
            "value" : 2006
          },
          "min-year" : {
            "value" : 2005
          }
        }
      ]

Upvotes: 0

Related Questions