Alex Pereira

Reputation: 347

filter_duplicate_text not working in aggregation query

I'm trying to replicate the filter_duplicate_text example from https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-significanttext-aggregation.html.

These are my settings, mapping and documents:

PUT /ods
{
  "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "analysis": {
          "filter": {
              "brazilian_stop": {
                  "type": "stop",
                  "stopwords": "_brazilian_"
              },
              "brazilian_stemmer": {
                  "type": "stemmer",
                  "language": "brazilian"
              }
          },
          "analyzer": {
              "brazilian": {
                  "tokenizer": "standard",
                  "filter": [
                      "lowercase",
                      "brazilian_stop",
                      "brazilian_stemmer"
                  ]
              }
          }
      }
  }
}

PUT /ods/_mapping/ods
{"properties": {"descricao": {"type": "text", "analyzer": "brazilian"},"metaodsid": {"type": "integer"}}}


POST /_bulk
{"index":{"_index":"ods","_type":"ods", "_id" : "1" }}
{ "descricao": "erradicar a pobreza","metaodsid": 1}
{"index":{"_index":"ods","_type":"ods", "_id" : "2" }}
{"descricao": "crianças que vivem na pobreza", "metaodsid": 1}
{"index":{"_index":"ods","_type":"ods", "_id" : "3" }}
{"descricao": " Melhorar a educação e adaptação, redução de impacto e da mudança do clima", "metaodsid": 2}
{"index":{"_index":"ods","_type":"ods", "_id" : "4" }}
{"descricao": "Integrar medidas da mudança do clima nas políticas", "metaodsid": 2}

And when I run the following query:

GET /ods/_search
{
  "query": {
      "bool": {
        "filter": {
          "term": {
            "metaodsid": 2
          }
        }
      }
    },
    "aggs" : {
        "my_sample" : {
            "sampler" : {
                "shard_size" : 10
            },
            "aggs": {
                "keywords" : {
                  "filter_duplicate_text": true,
                  "significant_text" : { "field" : "descricao" }

                }
            }
        }
    }
}

I get back this error message: "Expected [START_OBJECT] under [filter_duplicate_text], but got a [VALUE_BOOLEAN] in [keywords]". I don't understand what is happening, because if I remove the line "filter_duplicate_text": true, the query works as expected.

Does anyone know how to solve this? Thanks.

Upvotes: 0

Views: 171

Answers (1)

arun

Reputation: 11023

Looking at the reference, it looks like you have filter_duplicate_text in the wrong place. It should be a sibling of field, inside the significant_text object, rather than a sibling of significant_text itself, like this:

"aggs": {
    "keywords" : {
        "significant_text" : {
            "field" : "descricao",
            "filter_duplicate_text": true
        }
    }
}
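
For completeness, here is a sketch of what the whole request body for GET /ods/_search might look like with the option moved, reusing the index, field names, and sampler settings from the question. I have not run it against this exact index, so treat it as a starting point. The error message probably comes from the way each key directly under an aggregation name (here keywords) is parsed as an aggregation type, which must be an object, so a bare boolean at that level is rejected.

```json
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "metaodsid": 2
        }
      }
    }
  },
  "aggs": {
    "my_sample": {
      "sampler": {
        "shard_size": 10
      },
      "aggs": {
        "keywords": {
          "significant_text": {
            "field": "descricao",
            "filter_duplicate_text": true
          }
        }
      }
    }
  }
}
```

The only change from the query in the question is that "filter_duplicate_text": true now sits next to "field" inside significant_text.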

Upvotes: 2
