Alexander Solonik

Reputation: 10230

How to match partial words in an Elasticsearch text search

I have a field called name in my Elasticsearch index with the value Single V.

If I search for S or Sing, I get no results, but if I enter the full word Single, I get the result Single V. The query I am using is as follows:

{
  "query": {
    "match": {
      "name": "singl"
    }
  },
  "sort": []
}

This gives me no results. Do I need to change the mapping/settings or the analyzer for name?

EDIT:

I am trying to create an index with the following mapping and settings:

PUT my_cars
{
  "settings": {
    "analysis": {
      "normalizer": {
        "sortable": {
          "filter": ["lowercase"]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        },
        "tokenizer": {
          "my_tokenizer": {
            "type": "ngram",
            "min_gram": 1,
            "max_gram": 36,
            "token_chars": [
              "letter"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_analyzer",
        "fields": {
          "keyword": {
            "type": "keyword",
            "normalizer": "sortable"
          }
        }
      }
    }
  }
}

But I get the following error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "analyzer [tokenizer] must specify either an analyzer type, or a tokenizer"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "analyzer [tokenizer] must specify either an analyzer type, or a tokenizer"
  },
  "status" : 400
}

Upvotes: 1

Views: 4438

Answers (2)

Bhavya

Reputation: 16172

By default, Elasticsearch uses the standard analyzer for a text field if no analyzer is specified. This tokenizes "Single V" into "single" and "v". A match query only matches whole tokens, so you get a result for "Single" but not for partial terms such as "Sing" or "S".
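To make the token-matching behaviour concrete, here is a minimal Python sketch. It is only a rough stand-in for the standard analyzer (it splits on non-alphanumeric ASCII characters and lowercases), and the function names are hypothetical, not part of any Elasticsearch client.

```python
import re

def simple_standard_tokens(text):
    """Rough stand-in for the standard analyzer: split on
    non-alphanumeric ASCII characters, then lowercase each token."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

# Terms that would be stored for the field value "Single V".
indexed = simple_standard_tokens("Single V")

def matches(query):
    # A match query hits when any analyzed query term equals an indexed term.
    return any(term in indexed for term in simple_standard_tokens(query))

print(indexed)
print(matches("Single"), matches("singl"), matches("S"))
```

Only the full word produces a term that exists in the index, which is why the partial inputs return nothing.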

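If you want to do a partial search, you can use an edge n-gram tokenizer or a wildcard query.

The mapping for the edge n-gram tokenizer would be as follows. The lowercase filter is included because the tokenizer itself does not lowercase tokens, so without it a query for "singl" would not match the indexed "Singl":

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer",
          "filter": ["lowercase"]
        }
      },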
      "tokenizer": {
        "my_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 6,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "max_ngram_diff": 10
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
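The effect of these settings can be sketched in Python. This is a simplified emulation of an edge n-gram tokenizer with min_gram 2, max_gram 6 and token_chars letter and digit, not Elasticsearch's implementation. It assumes ASCII input and assumes the tokens are lowercased afterwards (the tokenizer alone does not change case); the function name is hypothetical.

```python
import re

def edge_ngrams(text, min_gram=2, max_gram=6):
    """Simplified edge n-gram emulation: for each run of letters/digits,
    emit its prefixes of length min_gram..max_gram, lowercased
    (lowercasing assumes a lowercase token filter in the analyzer)."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        # Words shorter than min_gram produce no tokens at all.
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:n].lower())
    return tokens

indexed_terms = edge_ngrams("Single V")
print(indexed_terms)
```

With these values "singl" becomes an indexed term, but a one-letter search such as "s" still finds nothing because min_gram is 2; lowering min_gram to 1 would be needed for that case.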

Update 1:

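In the index mapping given in the question, the closing brace } of the "analyzer" object is misplaced, so "tokenizer" ends up nested inside "analyzer" and is parsed as an analyzer named tokenizer, which is what the error message reports. The difference between max_gram and min_gram is 35, which exceeds the default index.max_ngram_diff of 1, so that setting has to be raised as well. Modify your index mapping as shown below: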

{
  "settings": {
    "analysis": {
      "normalizer": {
        "sortable": {
          "filter": [
            "lowercase"
          ]
        }
      },
      "analyzer": {
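        "my_analyzer": {
          "tokenizer": "my_tokenizer",
          "filter": ["lowercase"]           // the ngram tokenizer does not lowercase tokens itself
        }
      },                                    // "analyzer" closes here, before "tokenizer"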
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 36,
          "token_chars": [
            "letter"
          ]
        }
      }
    },
    "max_ngram_diff": 50
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_analyzer",
        "fields": {
          "keyword": {
            "type": "keyword",
            "normalizer": "sortable"
          }
        }
      }
    }
  }
}

Upvotes: 3

Tushar Shahi

Reputation: 20441

This is because of the default analyzer. The analyzer breaks the field value into the tokens [single, v]. A match query looks for an exact match against any of the query tokens. Since you are only passing Singl, the only query token is singl, which matches neither of the two tokens stored in the index.

You can use a wildcard query instead. Note that a pattern with a leading * can be slow on large indices:

{
  "query": {
    "wildcard": {
      "name": {
        "value": "*singl*"
      }
    }
  }
}
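As a rough illustration of why a wildcard pattern finds the document, the sketch below compares patterns against the indexed terms using Python's fnmatch. This is only an approximation: fnmatch also understands [seq] character classes, which the Elasticsearch wildcard query does not, and the term list is an assumption based on the default analyzer.

```python
from fnmatch import fnmatchcase

# Assumed terms stored for "Single V" under the default (standard) analyzer.
indexed_terms = ["single", "v"]

def wildcard_hits(pattern):
    # The pattern is compared against each whole term, case-sensitively,
    # because a wildcard pattern is not analyzed before matching.
    return [term for term in indexed_terms if fnmatchcase(term, pattern)]

print(wildcard_hits("*singl*"))
print(wildcard_hits("sing*"))
print(wildcard_hits("*Singl*"))
```

Because the stored terms are lowercase, the pattern should be lowercase too unless the query is made case-insensitive.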

Upvotes: 0
