backtrack
backtrack

Reputation: 8154

Elasticsearch nested cardinality aggregation

I have a mapping with nested schema, i am tring to do aggregation on nested field and order by docid count.

select name, count(distinct docid) as uniqueid from table
group by name
order by uniqueid desc 

Above is what i am trying to do.

{
   "size": 0,
   "aggs": {
      "samples": {
         "nested": {
            "path": "sample"
         },
         "aggs": {
            "sample": {
               "terms": {
                  "field": "sample.name",
        "order": {
                     "DocCounts": "desc"
                  }
               },
               "aggs": {
                  "DocCounts": {
                     "cardinality": {
                        "field": "docid"              
                     }
                  }
               }
            }
         }
      }
   }
}

But in the result i am not getting the expected output

result:

"buckets": [
               {
                  "key": "xxxxx",
                  "doc_count": 173256,
                  "DocCounts": {
                     "value": 0
                  }
               },
               {
                  "key": "yyyyy",
                  "doc_count": 63,
                  "DocCounts": {
                     "value": 0
                  }
               }
]

i am getting the DocCounts = 0. This is not expected. What went wrong in my query.

Upvotes: 3

Views: 3790

Answers (3)

Tushar
Tushar

Reputation: 31

You can use reverse nested aggregation on top of Cardinality aggregation on DocCounts. This is because when nested aggregation is applied, the query runs against the nested document. So to access any field of parent document inside nested doc, reverse nested aggregation can be used. Check ES Reference for more info on this.

Your cardinality query will look like:

"aggs": {
    "internal_DocCounts": {
        "reverse_nested": { },
        "DocCounts": {
            "cardinality": {
                "field": "docid"
            }
        }
    }
}

The response will look like:

  "buckets": [
  {
    "key": "xxxxx",
    "doc_count": 173256,
    "internal_DocCounts": {
      "doc_count": 173256,
      "DocCounts": {
        "value": <some_value>
      }
    }
  },
  {
    "key": "yyyyy",
    "doc_count": 63,
    "internal_DocCounts": {
      "doc_count": 63,
      "DocCounts": {
        "value": <some_value>
      }
    }
  },
  .....

Check this similar thread

Upvotes: 0

Kamil
Kamil

Reputation: 401

In general when doing some aggregation on nested type by value from upper scope, we observed that we need to put/copy the value from upper scope on nested type when storing document.

Then in your case aggregation would look like:

"aggs": {
  "DocCounts": {
    "cardinality": {
       "field": "sample.docid"              
    }
  }
}

It works in such case at least on version 1.7 of Elasticsearch.

Upvotes: 1

core
core

Reputation: 871

I think your last nested aggregation is too much. Try to get rid of it:

{
  "size": 0,
  "aggs": {
    "samples": {
      "nested": {
        "path": "sample"
      },
      "aggs": {
        "sample": {
          "terms": {
            "field": "sample.name",
            "order": {
              "DocCounts": "desc"
            }
          },
          "DocCounts": {
            "cardinality": {
              "field": "docid"
            }
          }
        }
      }
    }
  }
}

Upvotes: 1

Related Questions