lalilulelo_1986

Reputation: 545

Filter out query words (multi-language) that I don't want Elasticsearch to use for search

I have the query below. When I pass a query argument like TOO Big House, I don't want Elasticsearch to search by the word TOO, because a lot of names start with TOO. I couldn't find anything about this in the documentation. Is it possible in Elasticsearch?

{
  "bool": {
    "must": [
      {
        "match": {
          "consignorOrganizationName": {
            "query": "?0"
          }
        }
      }
    ]
  }
}

Field from index:

"properties": {
     "consignorOrganizationName": {
          "type": "text"
     }
}

Later I figured out that the problem may be caused by multi-language stopwords. I tried the following, and it seems to work for me, but I'm not sure whether this approach is good:

"analyzer": {
    "company_analyzer": {
        "tokenizer":  "standard",
        "filter": [
            "lowercase",
            "russian_stop",
            "english_stop"
        ]
    }
},
"filter": {
    "russian_stop": {
        "type":       "stop",
        "ignore_case": true,
        "stopwords":  ["ТОО"] 
    },
    "english_stop": {
        "type":       "stop",
        "ignore_case": true,
        "stopwords":  ["TOO"] 
    }
}

Upvotes: 0

Views: 85

Answers (1)

glenacota

Reputation: 2547

If you just want to rely on text analysis, you can create a custom analyzer with a stop token filter in which you specify your custom stopword TOO (see docs).

PUT your-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "tokenizer": "whitespace",
          "filter": [ "my_custom_stop_words_filter" ]
        }
      },
      "filter": {
        "my_custom_stop_words_filter": {
          "type": "stop",
          "ignore_case": true,
          "stopwords": [ "TOO" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "consignorOrganizationName": {
        "type": "text",
        "analyzer": "custom_analyzer"
      }
    }
  }
}
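
To check that the stopword is actually dropped, you can run a sample string through the analyzer with the analyze API. This is only a quick sketch: it assumes the index was created exactly as above (your-index and custom_analyzer are the placeholder names from this answer), and the sample text is the one from the question.

```json
POST your-index/_analyze
{
  "analyzer": "custom_analyzer",
  "text": "TOO Big House"
}
```

The response should list only the tokens Big and House. Because this analyzer uses the whitespace tokenizer without a lowercase filter, the remaining tokens keep their original case. Note also that a Cyrillic ТОО is a different string from the Latin TOO, so it would need its own entry in the stopwords list.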

Upvotes: 1
