Reputation: 79
I am working with Elasticsearch and am trying to search for a substring inside a field, for example, finding the string tac in stack overflow. I am using a MultiMatchQuery for this, but it does not work. Here is a snippet of my code (first_name is the field name).
searchString = "*" + searchString.toLowerCase() + "*";
MultiMatchQueryBuilder mqb = new MultiMatchQueryBuilder(searchString, "first_name");
mqb.type(MultiMatchQueryBuilder.Type.PHRASE);
BoolQueryBuilder searchQuery = boolQuery();
searchQuery.should(mqb);
NativeSearchQueryBuilder queryBuilder = new NativeSearchQueryBuilder();
queryBuilder.withQuery(searchQuery);
NativeSearchQuery query = queryBuilder.build();
When I search for tac, it does not return any results. When I search for stack or overflow, it does return stack overflow, so it seems to match only whole words. I tried using MultiMatchQueryBuilder.Type.PHRASE_PREFIX, but that only matches phrases starting with the substring: it works with strings like stac or overf, but not tac or tack.
Any suggestions on how to fix it?
Upvotes: 1
Views: 647
Reputation: 32376
A match query is analyzed: the search text goes through the same analyzer that was applied to the field at index time. I believe you are using the standard analyzer, which generates the tokens below:
POST http://localhost:9200/_analyze
{
"text": "stack overflow",
"analyzer" : "standard"
}
{
"tokens": [
{
"token": "stack",
"start_offset": 0,
"end_offset": 5,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "overflow",
"start_offset": 6,
"end_offset": 14,
"type": "<ALPHANUM>",
"position": 1
}
]
}
Hence, searching for tac doesn't match any token in the index. You need to change the analyzer so that the query-time tokens match the index-time tokens.
N-grams can solve the issue; the example below uses an ngram token filter.
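To see why n-grams help, here is a rough, standalone Java sketch that only imitates what the analyzers produce; it does not use Elasticsearch at all. The class name NGramSketch and both helper methods are made up for illustration, and the splitting rule is a simplification of the standard tokenizer, so treat it as an approximation rather than the real analysis chain.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: mimics the token streams, not Elasticsearch itself.
public class NGramSketch {

    // Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics.
    static List<String> standardTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Every substring of each token whose length lies in [minGram, maxGram],
    // which is roughly what an ngram token filter adds to the index.
    static Set<String> ngramTerms(String text, int minGram, int maxGram) {
        Set<String> terms = new LinkedHashSet<>();
        for (String token : standardTokens(text)) {
            for (int start = 0; start < token.length(); start++) {
                int longest = Math.min(maxGram, token.length() - start);
                for (int len = minGram; len <= longest; len++) {
                    terms.add(token.substring(start, start + len));
                }
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String doc = "stack overflow";
        // Whole-word tokens only: "tac" is not an indexed term.
        System.out.println(standardTokens(doc).contains("tac"));    // false
        // With n-grams of length 1..10, "tac" (from "stack") is indexed.
        System.out.println(ngramTerms(doc, 1, 10).contains("tac")); // true
    }
}
```

A match query succeeds only when the query term equals an indexed term, which is why the search text itself should not be split into n-grams: it stays a single token (tac) and is looked up among the indexed grams.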
Example
Index mapping
{
"settings": {
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "ngram",
"min_gram": 1,
"max_gram": 10
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"autocomplete_filter"
]
}
}
},
"index.max_ngram_diff" : 10
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "standard"
}
}
}
}
Index sample doc
{
"title" : "stack overflow"
}
And search query
{
"query": {
"match": {
"title": "tac"
}
}
}
And search result
"hits": [
{
"_index": "65241835",
"_type": "_doc",
"_id": "1",
"_score": 0.4739784,
"_source": {
"title": "stack overflow"
}
}
]
}
Upvotes: 1