Rahul

Reputation: 199

Elasticsearch prefix query

How do I do a prefix search in Elasticsearch?

For example, if I have the following indexed documents:

[{
        "id": "1",
        "key": "abc",
        "foo": [1, 2, 3]
    },
    {
        "id": "2",
        "key": "ab",
        "foo": [4]
    },
    {
        "id": "3",
        "key": "xyz",
        "foo": [9, 10]
    },
    {
        "id": "4",
        "key": "abcd",
        "foo": [12]
    }
]

Now I want to run a query on the attribute "key" with the value "abcdef".

I expect the following documents to match the query.

document id | matched | reason
"1"         | YES     | "abc" is a prefix of "abcdef"
"2"         | YES     | "ab" is a prefix of "abcdef"
"3"         | NO      | "xyz" is not a prefix of "abcdef"
"4"         | YES     | "abcd" is a prefix of "abcdef"

Upvotes: 0

Views: 1661

Answers (2)

Val

Reputation: 217294

Using an edge_ngram tokenizer (or token filter) is correct, but you should only apply it at search time, since the documents already contain the prefixes you're searching for.

So your index settings and mappings should look like this:

PUT test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "prefix_analyzer": {
          "tokenizer": "prefix_tokenizer"
        }
      },
      "tokenizer": {
        "prefix_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 6,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "index": {
      "max_ngram_diff": 10
    }
  },
  "mappings": {
    "properties": {
      "key": {
        "type": "text",
        "analyzer": "keyword",
        "search_analyzer": "prefix_analyzer"
      }
    }
  }
}

Then, your search query can look like this:

POST test/_search
{
  "query": {
    "match": {
      "key": "abcdef"
    }
  }
}

What is going to happen is that the input string abcdef will get tokenized into the following tokens (see the _analyze call below):

  • ab
  • abc
  • abcd
  • abcde
  • abcdef
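
A quick way to verify this tokenization is the _analyze API (a sketch, run against the test index created above):

POST test/_analyze
{
  "analyzer": "prefix_analyzer",
  "text": "abcdef"
}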

In the results you'll get:

  • The first token will match document 2
  • The second token will match document 1
  • The third token will match document 4

Upvotes: 1

Bhavya

Reputation: 16172

You can use the edge_ngram tokenizer to query the attribute "key" with the value "abcdef".

Adding a working example

Index Mapping:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 6,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "max_ngram_diff": 10
  },
  "mappings": {
    "properties": {
      "key": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
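
Index Data:

The documents from the question need to be indexed before running the search below. A sketch for one of them (the index name 67419529 is taken from the search result below; index the other documents the same way):

PUT 67419529/_doc/1
{
  "id": "1",
  "key": "abc",
  "foo": [1, 2, 3]
}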

Search Query:

{
  "query": {
    "match": {
      "key": "abcdef"
    }
  }
}

Search Result:

"hits": [
      {
        "_index": "67419529",
        "_type": "_doc",
        "_id": "4",
        "_score": 1.8710749,
        "_source": {
          "id": "4",
          "key": "abcd",
          "foo": [
            12
          ]
        }
      },
      {
        "_index": "67419529",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.0498221,
        "_source": {
          "id": "1",
          "key": "abc",
          "foo": [
            1,
            2,
            3
          ]
        }
      },
      {
        "_index": "67419529",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.44839138,
        "_source": {
          "id": "2",
          "key": "ab",
          "foo": [
            4
          ]
        }
      }
    ]

Upvotes: 0
