an__snatcher

Reputation: 131

Multiple Analyzers to a specific field

I am working with Elasticsearch 6.4.2 and need to apply multiple analyzers to a single field. Specifically, I want to apply the snowball and stop-word analyzers to the title and content fields. My mapping is below. Is this the correct way to define the analyzers?

PUT /some-index
{
    "settings": {
        "index": {
            "number_of_shards": 5,
            "number_of_replicas": 1,
            "refresh_interval": "60s",
            "analysis": {
                "analyzer": {
                    "my_analyzer": {
                        "tokenizer": "standard",
                        "filter": ["standard", "lowercase", "my_snow"]
                    },
                    "stop_analyzer": {
                        "type": "stop",
                        "stopwords": "_english_"
                    }
                },
                "filter": {
                    "my_snow": {
                        "type": "snowball",
                        "language": "Lovins"
                    }
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "_source": {
                "enabled": true
            },
            "properties": {
                "content": {
                    "type": "text",
                    "index": "true",
                    "store": true,
                    "analyzer": ["my_analyzer", "stop_analyzer"],
                    "search_analyzer": ["my_analyzer", "stop_analyzer"]
                },
                "title": {
                    "type": "text",
                    "index": "true",
                    "store": true,
                    "analyzer": ["my_analyzer", "stop_analyzer"],
                    "search_analyzer": ["my_analyzer", "stop_analyzer"]
                },
                "url": {
                    "type": "text",
                    "index": "true",
                    "store": true
                }
            }
        }
    }
}

Upvotes: 2

Views: 4022

Answers (1)

Kamal Kunjapur

Reputation: 8840

What you are looking for is not possible. You cannot apply multiple analyzers to a single field: the analyzer and search_analyzer settings each take a single analyzer name, not a list.

Given your requirements, though, you don't need two analyzers. Define one custom analyzer that chains both a snowball filter and a stop filter, as shown in Solution 1. I have also included two other approaches for reference, although I don't think they make much sense for your use case.

Solution 1: Use a single analyzer with two filters (one for snowball, one for stop words)

Mapping

PUT <your_index_name>
{
  "settings": {
        "analysis" : {
            "analyzer" : {
                "my_analyzer" : {
                    "tokenizer" : "standard",
                    "filter" : ["standard", "lowercase", "my_snow", "my_stop"]
                }
            },
            "filter" : {
                "my_snow" : {
                    "type" : "snowball",
                    "language": "English"
                },
                "my_stop": {
                    "type":       "stop",
                    "stopwords":  "_english_"
                }
            }
        }
    },
    "mappings": {
      "doc": {
            "_source": {
                "enabled": true
            },
            "properties": {
                "title": {
                    "type": "text",
                    "index": "true",
                    "store": true,
                    "analyzer": "my_analyzer"
                }
            }
        }
    }
}

Sample Analyze Query

POST <your_index_name>/_analyze
{
  "analyzer": "my_analyzer",
  "text": "This is the thing, perfection is not worth it"
}

Query Response

{
  "tokens": [
    {
      "token": "thing",
      "start_offset": 12,
      "end_offset": 17,
      "type": "<ALPHANUM>",
      "position": 3
    },
    {
      "token": "perfect",
      "start_offset": 19,
      "end_offset": 29,
      "type": "<ALPHANUM>",
      "position": 4
    },
    {
      "token": "worth",
      "start_offset": 37,
      "end_offset": 42,
      "type": "<ALPHANUM>",
      "position": 7
    }
  ]
}
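
To see the effect at search time, here is a minimal sketch that indexes one document and then searches for a different form of the same word. It assumes the Solution 1 mapping above has been created; the document ID 1 and the query text "perfect" are my own illustrative choices. Because the same analyzer runs at index and search time, both "perfection" (in the document) and "perfect" (in the query) should be reduced to the same token.

```
PUT <your_index_name>/doc/1?refresh
{
  "title": "This is the thing, perfection is not worth it"
}

POST <your_index_name>/_search
{
  "query": {
    "match": {
      "title": "perfect"
    }
  }
}
```

The ?refresh parameter only makes the document searchable immediately for this test; you would normally leave it off when bulk indexing.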

Solution 2: Using multi-field

However, if you really do want separate analyzers, you can create a multi-field whose sub-fields each use a different analyzer.

Below is how the mapping would look in that case. I am only showing the title field; you can apply the same change to the other fields. Note that this mapping is just for demonstration; I'd suggest Solution 1 for your requirement.

PUT <your_index_name>
{  
   "settings":{  
      //same as the one you've posted in the question. 
   },
   "mappings":{  
      "doc":{  
         "_source":{  
            "enabled":true
         },
         "properties":{  
            "title":{  
               "type":"text",
               "index":"true",
               "store":true,
               "analyzer":"my_analyzer",
               "fields":{  
                  "withstopwords":{  
                     "type":"text",
                     "analyzer":"stop_analyzer"
                  }
               }
            }
         }
      }
   }
}

Note that you need to use the correct field name when querying.

Use the field title for my_analyzer and title.withstopwords for stop_analyzer.
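
As a rough sketch of querying both variants at once, a multi_match query can list the parent field and the sub-field together. The sub-field name comes from the mapping above (withstopwords); the search text and the choice of multi_match are my own assumptions, not something the question asked for.

```
POST <your_index_name>/_search
{
  "query": {
    "multi_match": {
      "query": "perfection",
      "fields": ["title", "title.withstopwords"]
    }
  }
}
```

Each field analyzes the query text with its own analyzer, so title matches on the stemmed token while title.withstopwords matches on the unstemmed one.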

Solution 3: Multiple Indices, Same Alias

With this approach you would end up with:

some_index1:analyzer_type_1
some_index2:analyzer_type_2
Add alias "some_index" for both some_index1 & some_index2
Query using this alias "some_index"

You can then query through the alias as follows. Note that a query against some_index searches both some_index1 and some_index2 internally.

POST some_index/_search
{
  "query": {
    "match": {
      "title": "perfection"
    }
  }
}
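
The alias itself is not created by any of the requests shown so far. A minimal sketch of that step, assuming both indices already exist with their own analyzer settings and reusing the index and alias names from the list above, would be:

```
POST /_aliases
{
  "actions": [
    { "add": { "index": "some_index1", "alias": "some_index" } },
    { "add": { "index": "some_index2", "alias": "some_index" } }
  ]
}
```

Keep in mind that an alias pointing at more than one index is fine for searching, but documents still have to be indexed into some_index1 and some_index2 individually.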

Hope it helps!

Upvotes: 8
