Reputation: 199
How to do a prefix search in Elasticsearch?
For example, suppose I have the following indexed documents:
[{
"id": "1",
"key": "abc",
"foo": [1, 2, 3]
},
{
"id": "2",
"key": "ab",
"foo": [4]
},
{
"id": "3",
"key": "xyz",
"foo": [9, 10]
},
{
"id": "4",
"key": "abcd",
"foo": [12]
}
]
Now I want to query the attribute "key" with the value "abcdef".
I expect the following documents to match the query, because their key is a prefix of the query value:
document id | matched | reason |
---|---|---|
"1" | YES | "abc" is a prefix of "abcdef" |
"2" | YES | "ab" is a prefix of "abcdef" |
"3" | NO | "xyz" is not a prefix of "abcdef" |
"4" | YES | "abcd" is a prefix of "abcdef" |
Upvotes: 0
Views: 1661
Reputation: 217294
Using an edge-ngram tokenizer (or token filter) is correct, but you should apply it only at search time, since the indexed keys are themselves the prefixes you're searching for.
So your index settings and mappings should look like this:
PUT test
{
"settings": {
"analysis": {
"analyzer": {
"prefix_analyzer": {
"tokenizer": "prefix_tokenizer"
}
},
"tokenizer": {
"prefix_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 6,
"token_chars": [
"letter",
"digit"
]
}
}
},
"index": {
"max_ngram_diff": 10
}
},
"mappings": {
"properties": {
"key": {
"type": "text",
"analyzer": "keyword",
"search_analyzer": "prefix_analyzer"
}
}
}
}
Then, your search query can look like this:
POST test/_search
{
"query": {
"match": {
"key": "abcdef"
}
}
}
At search time, the input abcdef is tokenized into the following edge n-grams:
ab
abc
abcd
abcde
abcdef
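In the results you'll get documents 1, 2 and 4, since their keys (abc, ab and abcd) are indexed as whole tokens and each one equals one of the tokens above.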
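To make the matching logic concrete, here is a minimal pure-Python sketch of the idea. It is only a simulation, not how Elasticsearch is implemented: the helpers `edge_ngrams` and `search` are hypothetical names, and the sketch assumes the input contains only letters and digits, so the `token_chars` splitting is ignored.

```python
def edge_ngrams(text, min_gram=2, max_gram=6):
    """Return the leading substrings of `text` with lengths min_gram..max_gram."""
    upper = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, upper + 1)]


# Keys from the question. With the keyword analyzer at index time,
# each key is stored as one whole token.
docs = {"1": "abc", "2": "ab", "3": "xyz", "4": "abcd"}


def search(query, docs):
    """IDs of documents whose whole key equals one of the query's edge n-grams."""
    tokens = set(edge_ngrams(query))
    return sorted(doc_id for doc_id, key in docs.items() if key in tokens)


print(edge_ngrams("abcdef"))   # ['ab', 'abc', 'abcd', 'abcde', 'abcdef']
print(search("abcdef", docs))  # ['1', '2', '4']
```

One consequence of this approach: with min_gram 2 and max_gram 6, a key shorter than 2 or longer than 6 characters can never equal one of the search tokens, so you may need to adjust those bounds to fit your data.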
Upvotes: 1
Reputation: 16172
You can use the edge_ngram tokenizer to query the attribute "key" with the value "abcdef".
Here is a working example with the index mapping, search query and search result:
Index Mapping:
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "my_tokenizer"
}
},
"tokenizer": {
"my_tokenizer": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 6,
"token_chars": [
"letter",
"digit"
]
}
}
},
"max_ngram_diff": 10
},
"mappings": {
"properties": {
"key": {
"type": "text",
"analyzer": "my_analyzer"
}
}
}
}
Search Query:
{
"query": {
"match":{
"key":"abcdef"
}
}
}
Search Result:
"hits": [
{
"_index": "67419529",
"_type": "_doc",
"_id": "4",
"_score": 1.8710749,
"_source": {
"id": "4",
"key": "abcd",
"foo": [
12
]
}
},
{
"_index": "67419529",
"_type": "_doc",
"_id": "1",
"_score": 1.0498221,
"_source": {
"id": "1",
"key": "abc",
"foo": [
1,
2,
3
]
}
},
{
"_index": "67419529",
"_type": "_doc",
"_id": "2",
"_score": 0.44839138,
"_source": {
"id": "2",
"key": "ab",
"foo": [
4
]
}
}
]
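For intuition about why these three documents come back, here is a rough pure-Python sketch of this setup, where the same edge n-gram analyzer runs on both the indexed key and the query. It is a simulation under simplifying assumptions (letters and digits only, a match query with the default OR operator, no real scoring), and `edge_ngrams` and `shared_tokens` are hypothetical helper names.

```python
def edge_ngrams(text, min_gram=2, max_gram=6):
    """Set of leading substrings of `text` with lengths min_gram..max_gram."""
    upper = min(max_gram, len(text))
    return {text[:n] for n in range(min_gram, upper + 1)}


def shared_tokens(query, key):
    """Edge n-grams produced by both the query and the indexed key."""
    return sorted(edge_ngrams(query) & edge_ngrams(key))


keys = {"1": "abc", "2": "ab", "3": "xyz", "4": "abcd"}

for doc_id, key in keys.items():
    # A document is a hit when at least one token is shared.
    print(doc_id, key, shared_tokens("abcdef", key))

# A key that is not a prefix of the query still shares the token "ab".
print(shared_tokens("abcdef", "abxyz"))  # ['ab']
```

Documents 4, 1 and 2 share three, two and one tokens with the query, which is consistent with the order of the scores above. The last line suggests that a key such as "abxyz" would also match in this configuration, so if you need strict prefix semantics it seems safer to apply the edge n-gram analyzer only at search time.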
Upvotes: 0