Cássio

Reputation: 329

Querying an analyzed field doesn't work without specifying the analyzer in the query

I'm using Elasticsearch 7.14 and I want to query a field that uses a custom analyzer. This is the index:

PUT /my-index-001
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 0
    },
    "analysis": {
      "analyzer": {
        "alphanumeric_only_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": [
            "alphanumeric_only_filter"
          ],
          "filter": [
            "lowercase"
          ]
        }
      },
      "char_filter": {
        "alphanumeric_only_filter": {
          "type": "pattern_replace",
          "pattern": "[^A-Za-z0-9]",
          "replacement": ""
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "myField": {
        "type": "text",
        "analyzer": "alphanumeric_only_analyzer",
        "search_analyzer": "alphanumeric_only_analyzer"
      }
    }
  }
}

And two documents to test the queries:

POST /my-index-001/_doc
{
    "myField":"asd-9887"
}

POST /my-index-001/_doc
{
    "myField":"asd 9887"
}

Checking the analyzer shows that it works as expected, producing the single token "asd9887":

POST my-index-001/_analyze
{
  "analyzer": "alphanumeric_only_analyzer",
  "text": "aSd 9887"
}

Since everything is there and looks fine, let's start querying:

Query1: This finds both documents:

GET /my-index-001/_search
{
  "query": {
    "term": {
      "myField": "asd9887"
    }
  }
}

Query2: This doesn't find any documents:

GET /my-index-001/_search
{
  "query": {
    "term": {
      "myField": "asd 9887"
    }
  }
}

Query3: This finds both documents, but I had to specify which analyzer to use:

GET /my-index-001/_search
{
  "query": {
    "match": {
      "myField": {
        "query": "asd 9887",
        "analyzer": "alphanumeric_only_analyzer"
      }
    }
  }
}

Why am I required to do it this way, given that the mapping already sets search_analyzer to alphanumeric_only_analyzer?

Is there a way to make Query2 work as is? I don't want my users to have to know analyzer names, and I want them to find both documents when they query any value that, once analyzed, matches the analyzed document value.

Upvotes: 0

Views: 23

Answers (1)

jaspreet chahal

Reputation: 9099

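Use a match query instead of a term query.

The term query does not analyze the search term; it only looks for the exact term you provide. Query2 is therefore looking for the literal term "asd 9887" among the indexed tokens, but the only token stored for both documents is "asd9887", so nothing matches. The match query analyzes the search term with the field's search analyzer (here alphanumeric_only_analyzer, as set in the mapping), which produces the same tokens as at index time. So "asd 9887" is converted to "asd9887" while searching, and you don't need to pass the analyzer in the query the way Query3 does.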
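
For example, this is Query2 rewritten as a match query with no analyzer parameter. It is a sketch against the index and documents from the question (I haven't run it against 7.14 specifically), and it should return both documents because the search analyzer is taken from the mapping of myField:

GET /my-index-001/_search
{
  "query": {
    "match": {
      "myField": "asd 9887"
    }
  }
}

Your users then only send the text they are looking for; the analyzer name stays in the mapping.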

Upvotes: 1
