Renukaradhya
Renukaradhya

Reputation: 862

Elastic Search: Aggregation sum on a particular field

I am new to elastic search and requesting some help. Basically I have some 2 million documents in my elastic search and the documents look like below:

{
  "_index": "flipkart",
  "_type": "PSAD_ThirdParty",
  "_id": "430001_MAM_2016-02-04",
  "_version": 1,
  "_score": 1,
  "_source": {
    "metrics": [
      {
        "id": "Metric1",
        "value": 70
      },
      {
        "id": "Metric2",
        "value": 90
      },
      {
        "id": "Metric3",
        "value": 120
      }
    ],
    "primary": true,
    "ticketId": 1,
    "pliId": 206,
    "bookedNumbers": 15000,
    "ut": 1454567400000,
    "startDate": 1451629800000,
    "endDate": 1464589800000,
    "tz": "EST"
  }
}

I want to write an aggregation query which satisfies below conditions:

1) First query based on "_index", "_type" and "pliId". 2) Do aggregation sum on metrics.value based on metrics.id = "Metric1".

Basically I need to query records based on some fields and aggregate sum on a particular metrics value based on metrics id. Please can you help me in getting my query right.

Upvotes: 1

Views: 1531

Answers (2)

Renukaradhya
Renukaradhya

Reputation: 862

Created new index: Method : PUT , URL : http://localhost:9200/google/

Body:

    {
      "mappings": {
        "PSAD_Primary": {
          "properties": {
            "metrics": {
              "type": "nested",
              "properties": {
            "id": {
              "type": "string",
              "index": "not_analyzed"
            },
            "value": {
              "type": "integer",
              "index": "not_analyzed"
            }
          }
            }
          }
        }
      }
    }

Then I inserted some 200 thousand documents and than ran the query and it worked.

Response:

{
  "took": 34,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "google",
        "_type": "PSAD_Primary",
        "_id": "383701291_MAM_2016-01-06",
        "_score": 1,
        "_source": {
          "metrics": [
            {
              "id": "Metric1",
              "value": 70
            },
            {
              "id": "Metric2",
              "value": 90
            },
            {
              "id": "Metric3",
              "value": 120
            }
          ],
          "primary": true,
          "ticketId": 1,
          "pliId": 221244,
          "bookedNumbers": 15000,
          "ut": 1452061800000,
          "startDate": 1451629800000,
          "endDate": 1464589800000,
          "tz": "EST"
        }
      }
    ]
  },
  "aggregations": {
    "by_metrics": {
      "doc_count": 3,
      "metric1_only": {
        "doc_count": 1,
        "by_metric_id": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "Metric1",
              "doc_count": 1,
              "total_delivery": {
                "value": 70
              }
            }
          ]
        }
      }
    }
  }
}

Upvotes: 0

Andrei Stefan
Andrei Stefan

Reputation: 52368

Your metrics field needs to be of type nested:

    "metrics": {
      "type": "nested",
      "properties": {
        "id": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }

If you want Metric1 to match, meaning upper-case letter, then as you see above the id needs to be not_analyzed.

Then, if you only want metrics.id = "Metric1" aggregations, you need something like this:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "pliId": 206
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "by_metrics": {
      "nested": {
        "path": "metrics"
      },
      "aggs": {
        "metric1_only": {
          "filter": {
            "bool": {
              "must": [
                {
                  "term": {
                    "metrics.id": {
                      "value": "Metric1"
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "by_metric_id": {
              "terms": {
                "field": "metrics.id"
              },
              "aggs": {
                "total_delivery": {
                  "sum": {
                    "field": "metrics.value"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Upvotes: 1

Related Questions