extesy

Reputation: 93

Elasticsearch aggregation doesn't work with nested-type fields

I can't get an Elasticsearch aggregation with a filter to work on nested fields. The relevant part of the mapping looks like this:

"mappings": {
  "rb": {
    "properties": {
      "project": {
        "type": "nested",
        "properties": {
          "age": {
            "type": "long"
          },
          "name": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}

Essentially, the "rb" object contains a nested field called "project", which in turn contains two fields: "name" and "age". The query I'm running:

"aggs": {
  "root": {
    "aggs": {
      "group": {
        "aggs": {
          "filtered": {
            "aggs": {
              "order": {
                "percentiles": {
                  "field": "project.age",
                  "percents": ["50"]
                }
              }
            },
            "filter": {
              "range": {
                "last_updated": {
                  "gte": "2015-01-01",
                  "lt": "2015-07-01"
                }
              }
            }
          }
        },
        "terms": {
          "field": "project.name",
          "min_doc_count": 5,
          "order": {
            "filtered>order.50": "asc"
          },
          "shard_size": 10,
          "size": 10
        }
      }
    },
    "nested": {
      "path": "project"
    }
  }
}

This query is supposed to return the top 10 projects (by the project.name field) that match the date filter, sorted by their median age, ignoring projects with fewer than 5 mentions in the database. The median should be calculated only over projects matching the filter (the date range).

Despite there being more than a hundred thousand objects in the database, this query returns an empty list. There are no errors, just an empty response. I've tried it on both ES 1.6 and ES 2.0-beta.

Upvotes: 6

Views: 9330

Answers (1)

Val

Reputation: 217304

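I've reorganized your aggregation query a bit and got some results to show up. The main point is that you are aggregating on a nested type: inside a nested aggregation, the sub-aggregations run against the nested project documents, and last_updated is not one of their fields (assuming it lives on the root document), so a filter placed there matches nothing. I therefore took the filter aggregation on the last_updated field out and moved it up the hierarchy, making it the first aggregation. Then comes the nested aggregation on the project field, and finally the terms and percentiles aggregations.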

That seems to work well. Please give it a try:

{
  "size": 0,
  "aggs": {
    "filtered": {
      "filter": {
        "range": {
          "last_updated": {
            "gte": "2015-01-01",
            "lt": "2015-07-01"
          }
        }
      },
      "aggs": {
        "root": {
          "nested": {
            "path": "project"
          },
          "aggs": {
            "group": {
              "terms": {
                "field": "project.name",
                "min_doc_count": 5,
                "shard_size": 10,
                "order": {
                  "order.50": "asc"
                },
                "size": 10
              },
              "aggs": {
                "order": {
                  "percentiles": {
                    "field": "project.age",
                    "percents": [
                      "50"
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
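If you build the request from code, here is a minimal Python sketch of the same filter > nested > terms > percentiles layout, plus a helper that reads the median back out of the response. The function names build_query and median_by_project are made up for illustration, the sample response is invented data, and the "50.0" key assumes the default keyed percentiles output; none of this comes from your setup, so adjust as needed.

```python
import json


def build_query(gte, lt, size=10, min_doc_count=5):
    """Build the reorganized aggregation: filter -> nested -> terms -> percentiles."""
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "filtered": {
                # Assumption: last_updated is a root-level field, so filter before going nested.
                "filter": {"range": {"last_updated": {"gte": gte, "lt": lt}}},
                "aggs": {
                    "root": {
                        "nested": {"path": "project"},
                        "aggs": {
                            "group": {
                                "terms": {
                                    "field": "project.name",
                                    "min_doc_count": min_doc_count,
                                    "shard_size": size,
                                    # "order" is now a direct child of "group", so no "filtered>" prefix.
                                    "order": {"order.50": "asc"},
                                    "size": size,
                                },
                                "aggs": {
                                    "order": {
                                        "percentiles": {
                                            "field": "project.age",
                                            "percents": [50],
                                        }
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


def median_by_project(response):
    """Return (project name, doc count, median age) for each terms bucket."""
    buckets = response["aggregations"]["filtered"]["root"]["group"]["buckets"]
    # Assumption: percentiles are returned keyed, e.g. {"values": {"50.0": 12.0}}.
    return [(b["key"], b["doc_count"], b["order"]["values"]["50.0"]) for b in buckets]


if __name__ == "__main__":
    # Request body to send to the _search endpoint.
    print(json.dumps(build_query("2015-01-01", "2015-07-01"), indent=2))

    # Invented response, for illustration only.
    sample = {
        "aggregations": {
            "filtered": {
                "doc_count": 42,
                "root": {
                    "doc_count": 42,
                    "group": {
                        "buckets": [
                            {"key": "alpha", "doc_count": 7, "order": {"values": {"50.0": 12.0}}}
                        ]
                    },
                },
            }
        }
    }
    print(median_by_project(sample))
```

Note that with the filter at the top, min_doc_count counts only the project entries inside the date range, which may or may not be what you want for the "fewer than 5 mentions" rule.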

Upvotes: 8
