Andrew F

Reputation: 595

Semi-exact (complete) match in ElasticSearch

Is there a way to require a complete (though not necessarily exact) match in ElasticSearch?

For instance, if a field has the term "I am a little teapot short and stout", I would like to match on " i am a LITTLE TeaPot short and stout! " but not just "teapot short and stout". I've tried the term filter, but that requires an actual exact match.

Upvotes: 1

Views: 191

Answers (1)

Andrei Stefan

Reputation: 52368

If "not necessarily exact" means ignoring differences in uppercase/lowercase letters and in punctuation marks (like the ! in your example), this would be a solution, though not a very simple or obvious one:

The mapping:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_keyword_lowercase": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase",
            "trim",
            "my_pattern_replace"
          ]
        }
      },
      "filter": {
        "my_pattern_replace": {
          "type": "pattern_replace",
          "pattern": "!",
          "replacement":""
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "text": {
          "type": "string",
          "analyzer": "my_keyword_lowercase"
        }
      }
    }
  }
}

The idea here is the following:

  1. Use the keyword tokenizer so the text is kept as a single token instead of being split into words.
  2. Use the lowercase filter to remove any difference between uppercase and lowercase characters.
  3. Use the trim filter to remove leading and trailing whitespace.
  4. Use a pattern_replace filter to remove the punctuation. This is needed because the keyword tokenizer leaves the characters inside the text untouched. The standard analyzer would drop the punctuation, but it would also split the text into tokens, whereas you need it kept whole. Note that the pattern above only strips "!"; widen it if other punctuation marks should be ignored as well.
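
To make the effect of that filter chain concrete, here is a rough pure-Python emulation of what the analyzer does to a value. This is only a sketch for illustration: the normalize function is a hypothetical helper, not part of Elasticsearch, and it assumes plain ASCII text, where Python's lower() and strip() behave like the lowercase and trim filters.

```python
def normalize(text: str) -> str:
    """Hypothetical stand-in for the my_keyword_lowercase analyzer."""
    token = text                    # keyword tokenizer: the whole input is one token
    token = token.lower()           # lowercase filter
    token = token.strip()           # trim filter: drop leading/trailing whitespace
    token = token.replace("!", "")  # pattern_replace with pattern "!" and empty replacement
    return token


# The indexed value and the query text both pass through the same chain.
indexed = normalize("I am a little teapot short and stout")
full_query = normalize(" i am a LITTLE TeaPot short and stout! ")
partial_query = normalize("teapot short and stout")

print(indexed == full_query)     # the complete phrase yields the same single token
print(indexed == partial_query)  # the partial phrase yields a different token
```

Because each side ends up as one token, the match succeeds only when the whole normalized strings are equal, which is why the partial phrase does not hit.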

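And this is the query you would use with the mapping above. A match query runs the query text through the field's analyzer, so it is normalized the same way as the indexed value: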

{
  "query": {
    "match": {
      "text": " i am a LITTLE TeaPot short and stout! "
    }
  }
}

Upvotes: 1
