Piyush N

Reputation: 794

How to get doc value in Elasticsearch Bucket Aggregation query instead of doc count

I have four documents in my ES index:

       {
            "_index": "my-index",
            "_type": "_doc",
            "_id": "1",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2099-11-15T13:12:00",
                "message": "INFO GET /search HTTP/1.1 200 1070000",
                "user": {
                    "id": "[email protected]"
                }
            }
        },
        {
            "_index": "my-index",
            "_type": "_doc",
            "_id": "2",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2099-11-15T13:15:00",
                "message": "Error GET /search HTTP/1.1 200 1070000",
                "user": {
                    "id": "[email protected]"
                }
            }
        },
       {
            "_index": "my-index",
            "_type": "_doc",
            "_id": "3",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2099-11-15T13:20:00",
                "message": "INFO GET /parse HTTP/1.1 200 1070000",
                "user": {
                    "id": "[email protected]"
                }
            }
        },
        {
            "_index": "my-index",
            "_type": "_doc",
            "_id": "4",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2099-11-15T13:26:00",
                "message": "Error GET /parse HTTP/1.1 200 1070000",
                "user": {
                    "id": "[email protected]"
                }
            }
        }

I am writing a bucket aggregation query that uses filters to group all the documents in the index by message type (info or error). In the example above there are four documents in the index: two have a message of type "info" and two have a message of type "error".

I want the bucket aggregation to return the results grouped by message type. The expected result is two buckets, each containing two documents. But my query returns only the document count for each bucket instead of the actual documents.

The query I am using is:

    {
      "size": 0,
      "aggs": {
        "messages": {
          "filters": {
            "filters": {
              "info": { "match": { "message": "Info" } },
              "error": { "match": { "message": "Error" } }
            }
          }
        }
      }
    }

The output of the query above is:

    {
      "took": 3,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 2,
          "relation": "eq"
        },
        "max_score": null,
        "hits": []
      },
      "aggregations": {
        "messages": {
          "buckets": {
            "error": {
              "doc_count": 2
            },
            "info": {
              "doc_count": 2
            }
          }
        }
      }
    }

But my requirement is to get the actual documents, with their field values, inside each bucket. Is there a way to change the filters bucket aggregation query so that each bucket returns its documents and their values?

Upvotes: 1

Views: 1685

Answers (2)

Vaisakh Rajagopal

Reputation: 1339

To categorize documents by a field, we can use a 'terms' aggregation.

But if you want to see every document in each category, add a sub-aggregation inside it: another 'terms' aggregation on a unique field, the obvious choice being '_id'.

Since this sub-aggregation groups by '_id', each of its buckets contains exactly one document.

Then, to extract the actual document data, nest a 'top_hits' aggregation inside the '_id' aggregation. Its size can always be 1.

{
  // 1) Categorize the data by a field
  "aggs": {
    "FIELD-LEVEL-AGGs-1": {
      "terms": {
        "field": "field-category"
      },
      // 2) Use '_id' level aggs if you want all data, otherwise use top_hits directly.
      //    Basically we are flattening the aggs 1
      "aggs": {
        "DOC-LEVEL-AGGs-2": {
          "terms": {
            "field": "_id"
          },
          // 3) Get the actual doc data with top_hits
          "aggs": {
            "DOC-DATA-AGGs-3": {
              "top_hits": {
                "size": 1,
                "_source": {
                  "includes": [
                    "field-category",
                    "field-name"
                  ]
                }
              }
            }
          }
        }
      }
    }
  }
}
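
The "use top_hits directly" variant mentioned in the comments can be sketched as follows. This is only an illustration in Python that builds the request body as a plain dict: the helper name `terms_with_top_hits`, the aggregation names `by_category` and `docs`, and the default of 10 documents per bucket are assumptions, not part of the original answer. It also assumes `field-category` is a field that a terms aggregation can run on (for example a keyword field).

```python
import json


def terms_with_top_hits(category_field, source_fields, docs_per_bucket=10):
    """Build a terms aggregation whose buckets carry their own documents.

    The '_id' level is skipped: top_hits sits directly under the terms
    aggregation and returns up to `docs_per_bucket` documents per bucket.
    """
    return {
        "size": 0,  # no top-level hits, only the aggregation result
        "aggs": {
            "by_category": {  # hypothetical aggregation name
                "terms": {"field": category_field},
                "aggs": {
                    "docs": {  # hypothetical aggregation name
                        "top_hits": {
                            "size": docs_per_bucket,
                            "_source": {"includes": source_fields},
                        }
                    }
                },
            }
        },
    }


body = terms_with_top_hits("field-category", ["field-category", "field-name"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request. Whether 10 documents per bucket is enough depends on the data, so treat that value as a placeholder.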

Upvotes: 0

Bhavya

Reputation: 16172

You can use a top_hits sub-aggregation to get the corresponding documents inside each bucket:

{
  "size": 0,
  "aggs": {
    "messages": {
      "filters": {
        "filters": {
          "info": {
            "match": {
              "message": "Info"
            }
          },
          "error": {
            "match": {
              "message": "Error"
            }
          }
        }
      },
      "aggs": {
        "top_filters_hits": {
          "top_hits": {
            "_source": {
              "includes": [
                "message",
                "user.id"
              ]
            }
          }
        }
      }
    }
  }
}

The search result will be:

"aggregations": {
    "messages": {
      "buckets": {
        "error": {
          "doc_count": 2,
          "top_filters_hits": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "67033379",
                  "_type": "_doc",
                  "_id": "2",
                  "_score": 1.0,
                  "_source": {
                    "message": "Error GET /search HTTP/1.1 200 1070000",
                    "user": {
                      "id": "[email protected]"
                    }
                  }
                },
                {
                  "_index": "67033379",
                  "_type": "_doc",
                  "_id": "4",
                  "_score": 1.0,
                  "_source": {
                    "message": "Error GET /parse HTTP/1.1 200 1070000",
                    "user": {
                      "id": "[email protected]"
                    }
                  }
                }
              ]
            }
          }
        },
        "info": {
          "doc_count": 2,
          "top_filters_hits": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "67033379",
                  "_type": "_doc",
                  "_id": "1",
                  "_score": 1.0,
                  "_source": {
                    "message": "INFO GET /search HTTP/1.1 200 1070000",
                    "user": {
                      "id": "[email protected]"
                    }
                  }
                },
                {
                  "_index": "67033379",
                  "_type": "_doc",
                  "_id": "3",
                  "_score": 1.0,
                  "_source": {
                    "message": "INFO GET /parse HTTP/1.1 200 1070000",
                    "user": {
                      "id": "[email protected]"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
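
Note that top_hits returns only the top three hits per bucket unless `size` is set, so larger buckets need an explicit `size`. As a rough sketch (not part of the original answer), the Python below builds the same request with an explicit `size` and then flattens a response of the shape shown above into one list of documents per bucket. The helper names `build_body` and `docs_by_bucket`, the default `size` of 10, and the trimmed `sample_response` are assumptions made for illustration.

```python
import json


def build_body(filters, source_fields, size=10):
    """Build the filters aggregation with a top_hits sub-aggregation.

    `filters` maps a bucket name to the text matched against `message`.
    """
    return {
        "size": 0,  # suppress top-level hits; only aggregations are needed
        "aggs": {
            "messages": {
                "filters": {
                    "filters": {
                        name: {"match": {"message": text}}
                        for name, text in filters.items()
                    }
                },
                "aggs": {
                    "top_filters_hits": {
                        "top_hits": {
                            "size": size,  # without this, only 3 hits per bucket
                            "_source": {"includes": source_fields},
                        }
                    }
                },
            }
        },
    }


def docs_by_bucket(response, agg="messages", sub_agg="top_filters_hits"):
    """Map each bucket name to the list of `_source` documents it holds."""
    buckets = response["aggregations"][agg]["buckets"]
    return {
        name: [hit["_source"] for hit in bucket[sub_agg]["hits"]["hits"]]
        for name, bucket in buckets.items()
    }


# Trimmed-down stand-in for the search result above (only `message` kept).
sample_response = {
    "aggregations": {
        "messages": {
            "buckets": {
                "error": {
                    "doc_count": 2,
                    "top_filters_hits": {"hits": {"hits": [
                        {"_id": "2", "_source": {"message": "Error GET /search HTTP/1.1 200 1070000"}},
                        {"_id": "4", "_source": {"message": "Error GET /parse HTTP/1.1 200 1070000"}},
                    ]}},
                },
                "info": {
                    "doc_count": 2,
                    "top_filters_hits": {"hits": {"hits": [
                        {"_id": "1", "_source": {"message": "INFO GET /search HTTP/1.1 200 1070000"}},
                        {"_id": "3", "_source": {"message": "INFO GET /parse HTTP/1.1 200 1070000"}},
                    ]}},
                },
            }
        }
    }
}

body = build_body({"info": "Info", "error": "Error"}, ["message", "user.id"])
print(json.dumps(body, indent=2))

grouped = docs_by_bucket(sample_response)
for bucket_name, docs in grouped.items():
    print(bucket_name, [doc["message"] for doc in docs])
```

With a real cluster, `body` would be sent to the index's `_search` endpoint and the parsed JSON response passed to `docs_by_bucket` in place of `sample_response`.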

Upvotes: 3
