JLumos

Reputation: 127

Spring data elasticsearch Filter aggregations

I have the following JSON query that is working and doing what I want:

{
  "aggs": {
    "values": {
      "filter": { "term": { "gender": "Male" } },
      "aggs": {
        "names": {
          "terms": { "field": "name", "size": 10000 }
        }
      }
    }
  },
  "size": 0
}

It retrieves all the distinct names from documents where the "gender" field is equal to "Male". It is an example of a filter aggregation in Elasticsearch.

I am now trying to write it with spring-data-elasticsearch and I have tried this:

    AbstractAggregationBuilder<TermsAggregationBuilder> agBuilder = AggregationBuilders.terms("name").field("name").size(10000);

    Query query = new NativeSearchQueryBuilder()
        .withQuery(QueryBuilders.matchAllQuery())
        .withFilter(QueryBuilders.termQuery("gender", "Male"))
        .withPageable(EmptyPage.INSTANCE)
        .addAggregation(agBuilder).build();

    //Execute request over the cluster
    SearchHits<Person> hits = elasticClient.search(query, Person.class);

    //Retrieve the aggregation details
    Aggregations aggs = hits.getAggregations();
    ParsedStringTerms topTags = (ParsedStringTerms) aggs.get("name");

Note that EmptyPage.INSTANCE behaves like PageRequest.of(0,0); it is a custom implementation written to achieve that.

Unfortunately it is not doing what I want: it ignores the filter clause and returns the names of both males and females. I suspect that withQuery should contain another query, but I don't know how to pass another NativeSearchQuery to that method.

Versions I am using:

How could I write the above json query with spring-data-elasticsearch? Thanks in advance.

Upvotes: 1

Views: 1243

Answers (1)

Val

Reputation: 217504

The filter actually needs to be inside withQuery().

Query query = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.termQuery("gender", "Male"))
    ...

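withFilter serves another purpose: it is sent as a post_filter, which is applied to the search hits only after the aggregations have been computed. That is why the aggregation still sees documents of both genders.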
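To make the difference concrete, here is a toy in-memory sketch in plain Java. It is not Elasticsearch or Spring Data code: the `Doc` class, the `aggregatedNames` and `hits` helpers, and the sample data are all made up for illustration. It only mimics the order of operations, where the aggregation runs on whatever the query matched and the post filter narrows the hits afterwards.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model of query vs. post_filter semantics; hypothetical names throughout.
class PostFilterSketch {

    // Hypothetical stand-in for an indexed document.
    static final class Doc {
        final String name;
        final String gender;

        Doc(String name, String gender) {
            this.name = name;
            this.gender = gender;
        }
    }

    // The aggregation sees every document matched by the query;
    // the post filter plays no part here.
    static Set<String> aggregatedNames(List<Doc> docs, Predicate<Doc> query) {
        return docs.stream()
                .filter(query)
                .map(d -> d.name)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // The returned hits are narrowed by the query first and by the post filter afterwards.
    static List<Doc> hits(List<Doc> docs, Predicate<Doc> query, Predicate<Doc> postFilter) {
        return docs.stream()
                .filter(query)
                .filter(postFilter)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("Alice", "Female"),
                new Doc("Bob", "Male"),
                new Doc("Carol", "Female"),
                new Doc("Dave", "Male"));

        Predicate<Doc> matchAll = d -> true;
        Predicate<Doc> male = d -> d.gender.equals("Male");

        // Like withQuery(matchAll) + withFilter(male): the aggregation covers everyone.
        System.out.println(aggregatedNames(docs, matchAll));   // [Alice, Bob, Carol, Dave]
        System.out.println(hits(docs, matchAll, male).size()); // 2

        // Like withQuery(male): the aggregation is restricted to males.
        System.out.println(aggregatedNames(docs, male));       // [Bob, Dave]
    }
}
```

The first call corresponds to the code in the question, the last one to moving the term query into withQuery().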

Also note that the query you are building is slightly different from the one at the top of your question. It looks like the one below and is actually a more optimized version, because the documents are filtered before the aggregation runs, instead of the aggregation visiting all documents and only considering the ones that match the filter. The end result is the same, but the performance is not.

{
  "size": 0,
  "query": {
    "term": {
      "gender": "Male"
    }
  },
  "aggs": {
    "names": {
      "terms": {
        "field": "name",
        "size": 10000
      }
    }
  }
}
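If you still want the filter aggregation exactly as in your JSON, for instance because other aggregations in the same request need to see all documents, something along these lines should be close. Treat it as a sketch rather than tested code: it assumes Spring Data Elasticsearch 4.0 to 4.2, where `SearchHits.getAggregations()` returns an Elasticsearch `Aggregations` object as in your snippet, and that `name` is a keyword field so the terms come back as `ParsedStringTerms`. The class name `FilteredNames` and the method `namesFor` are made up for illustration; the aggregation names `values` and `names` are taken from your JSON.

```java
import java.util.List;
import java.util.stream.Collectors;

import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.filter.FilterAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.filter.ParsedFilter;
import org.elasticsearch.search.aggregations.bucket.terms.ParsedStringTerms;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.data.elasticsearch.core.query.Query;

// Hypothetical helper class; requires the Spring Data Elasticsearch and
// Elasticsearch client libraries on the classpath.
class FilteredNames {

    static <T> List<String> namesFor(ElasticsearchOperations operations,
                                     Class<T> entityType,
                                     String gender) {
        // "values" filter aggregation wrapping a "names" terms sub-aggregation,
        // mirroring the JSON in the question.
        FilterAggregationBuilder values = AggregationBuilders
                .filter("values", QueryBuilders.termQuery("gender", gender))
                .subAggregation(AggregationBuilders.terms("names").field("name").size(10000));

        // No pageable is set here, so the default page of hits is still returned;
        // plug in the question's EmptyPage.INSTANCE if only the aggregation is needed.
        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchAllQuery())
                .addAggregation(values)
                .build();

        SearchHits<T> hits = operations.search(query, entityType);

        // Unwrap the filter bucket first, then read the nested terms aggregation.
        Aggregations aggs = hits.getAggregations();
        ParsedFilter filtered = aggs.get("values");
        ParsedStringTerms names = filtered.getAggregations().get("names");

        return names.getBuckets().stream()
                .map(bucket -> bucket.getKeyAsString())
                .collect(Collectors.toList());
    }
}
```

With the `Person` entity and `elasticClient` from the question, the call would look like `FilteredNames.namesFor(elasticClient, Person.class, "Male")`.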

Upvotes: 3
