srgbnd

Reputation: 5634

Using the possessive_english stemmer in Elasticsearch

I have the following analysis settings:

"settings" : { 
  "index" : { 
    "creation_date" : "1469213620697",
    "analysis" : { 
      "filter" : { 
        "stem_possessive_filter" : { 
          "name" : "possessive_english",
          "type" : "stemmer"
        }   
      },  
      "analyzer" : { 
        "stem_analyzer" : { 
          "filter" : [ "standard", "lowercase", "stem_possessive_filter" ],
          "tokenizer" : "standard"
        }   
      }   
    },  
    "number_of_shards" : "5",
    "number_of_replicas" : "1",
    "uuid" : "VQgaaZquQUOqKNYxGPH7cg",
    "version" : { 
      "created" : "2020199"
    }   
  }
},

Every field of string type has the following mapping:

"field_name" : {
   "type" : "string",
    "analyzer" : "stem_analyzer",
    "search_analyzer" : "standard"
 }

I want to be able to search for either the phrase dementia in alzheimer or the phrase dementia in alzheimer's and, in both cases, get back Dementia in Alzheimer's ....

The multi match query doesn't work if the possessive is not used:

{'query': {'multi_match': {'query': "dementia in alzheimer", 'type': 'phrase', 'fields': ['_all']}}}

But it works if the possessive is used:

{'query': {'multi_match': {'query': "dementia in alzheimer's", 'type': 'phrase', 'fields': ['_all']}}}

On the other hand, the bool query works if the possessive is not used:

{'query': {'bool': {'must': [{'match_phrase': {'Diagnosis': "dementia in alzheimer"}}]}}}

But it doesn't work if the possessive is used:

{'query': {'bool': {'must': [{'match_phrase': {'Diagnosis': "dementia in alzheimer's"}}]}}}

How can I make all of the queries above work?

-- UPDATE --

The bool query works if I add stem_analyzer to the query, so that it is also used at search time. I then get results for the phrase both with and without the possessive:

mybody = {'query': {'bool': {'must': [{'match_phrase': {'Diagnosis': {'query': "dementia in alzheimer's", 'analyzer': 'stem_analyzer'}}}]}}}

But the multi match query stops working altogether if I add the analyzer. I get no results for the phrase, with or without the possessive:

{'query': {'multi_match': {'query': "dementia in alzheimer's", 'type': 'phrase', 'analyzer': 'stem_analyzer', 'fields': ['_all']}}}

Why doesn't the analyzer work for the multi match query?

Upvotes: 2

Views: 3255

Answers (1)

srgbnd

Reputation: 5634

The phrase type doesn't work with stem_analyzer in the multi match query, but the phrase_prefix type does. Frankly, I don't know why; I found no hint about it in the documentation.

So, the following two multi match queries return the same results for me:

{'query': {'multi_match': {'query': "dementia in alzheimer", 'type': 'phrase_prefix', 'analyzer': 'stem_analyzer', 'fields': ['_all']}}}

{'query': {'multi_match': {'query': "dementia in alzheimer's", 'type': 'phrase_prefix', 'analyzer': 'stem_analyzer', 'fields': ['_all']}}}
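A possible explanation, offered as an untested assumption rather than a confirmed cause: the _all field may be indexed with its own analyzer (standard by default) instead of stem_analyzer, so it would still contain the token alzheimer's. A stemmed query term alzheimer would then fail an exact phrase match but succeed as a prefix. The sketch below only imitates that idea in plain Python. The helper names, the regex tokenizer, and the sample text "Dementia in Alzheimer's disease" are all hypothetical stand-ins, not Elasticsearch code.

```python
import re

def standard_tokens(text):
    # Rough stand-in for the standard tokenizer plus lowercase filter;
    # an inner apostrophe is kept, so "Alzheimer's" stays one token.
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())

def stem_tokens(text):
    # Rough stand-in for stem_analyzer: same tokens, trailing 's removed.
    return [t[:-2] if t.endswith("'s") else t for t in standard_tokens(text)]

def phrase_matches(indexed, query, prefix_last=False):
    # Consecutive-token match; with prefix_last=True the final query
    # token only has to be a prefix of the indexed token (phrase_prefix).
    if not query:
        return False
    n = len(query)
    for i in range(len(indexed) - n + 1):
        window = indexed[i:i + n]
        if window[:-1] != query[:-1]:
            continue
        if window[-1] == query[-1]:
            return True
        if prefix_last and window[-1].startswith(query[-1]):
            return True
    return False

# Assumption: _all is analyzed with "standard", not with stem_analyzer.
all_indexed = standard_tokens("Dementia in Alzheimer's disease")
stemmed_query = stem_tokens("dementia in alzheimer's")

phrase_hit = phrase_matches(all_indexed, stemmed_query)
prefix_hit = phrase_matches(all_indexed, stemmed_query, prefix_last=True)
```

If this assumption holds for a given index, comparing the tokens produced for _all and for Diagnosis with the _analyze API should show the difference.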

In the bool query, match_phrase works with stem_analyzer. The following two queries return the same results:

{'query': {'bool': {'must': [{'match_phrase': {'Diagnosis': {'query': "dementia in alzheimer", 'analyzer': 'stem_analyzer'}}}]}}}

{'query': {'bool': {'must': [{'match_phrase': {'Diagnosis': {'query': "dementia in alzheimer's", 'analyzer': 'stem_analyzer'}}}]}}}
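The behavior of the bool queries fits the mapping in the question: Diagnosis is indexed with stem_analyzer but searched with standard, so the index holds alzheimer while an unstemmed query still carries alzheimer's. The following is a minimal sketch of that mismatch in plain Python. The helpers and the sample text "Dementia in Alzheimer's disease" are hypothetical simplifications and only approximate what Elasticsearch does.

```python
import re

def standard_tokens(text):
    # Rough stand-in for the standard tokenizer plus lowercase filter;
    # an inner apostrophe is kept, so "Alzheimer's" stays one token.
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())

def stem_tokens(text):
    # Rough stand-in for stem_analyzer: same tokens, trailing 's removed.
    return [t[:-2] if t.endswith("'s") else t for t in standard_tokens(text)]

def phrase_matches(indexed, query):
    # True if the query tokens occur consecutively in the indexed tokens.
    n = len(query)
    return any(indexed[i:i + n] == query
               for i in range(len(indexed) - n + 1))

# Index time: the field goes through stem_analyzer.
indexed = stem_tokens("Dementia in Alzheimer's disease")

# Search time with the "standard" search_analyzer from the mapping.
plain_ok = phrase_matches(indexed, standard_tokens("dementia in alzheimer"))
possessive_ok = phrase_matches(indexed, standard_tokens("dementia in alzheimer's"))

# Search time with 'analyzer': 'stem_analyzer' set in the query.
stemmed_ok = phrase_matches(indexed, stem_tokens("dementia in alzheimer's"))
```

Using the same analyzer on both sides makes the two phrasings produce identical tokens, which is why both queries return the same results.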

Upvotes: 2
