Binoy Bhanujan

Reputation: 243

How to implement case sensitive search in elasticsearch?

I have a field in my indexed documents where I need the search to be case sensitive. I am using the match query to fetch the results. An example of my data document is:

{
  "name" : "binoy",
  "age" : 26,
  "country": "India"
}

Now when I give the following query:

{
  "query" : {
    "match" : {
      "name" : "Binoy"
    }
  }
}

It gives me a match for "binoy" against "Binoy". I want the search to be case sensitive. It seems elasticsearch goes with case-insensitive search by default. How do I make the search case sensitive in elasticsearch?

Upvotes: 15

Views: 25454

Answers (4)

Andrej Maya

Reputation: 121

Here is the full index template which worked for me on Elasticsearch 5.6:

{
  "template": "logstash-*",
  "settings": {
     "analysis" : {
         "analyzer" : {
             "case_sensitive" : {
                 "type" : "custom",
                 "tokenizer":    "standard",
                 "filter": ["stop", "porter_stem" ]                    
             }
         }
     },        
     "number_of_shards": 5,
     "number_of_replicas": 1      
  },      
  "mappings": {
   "fluentd": {
     "properties": {
       "message": {
         "type": "text",
         "fields": {
           "case_sensitive": { 
             "type": "text",
             "analyzer": "case_sensitive"
           }
         }          
       }
     }
   }
  }
}

As you can see, the logs come from Fluentd and are saved into a time-based index logstash-*. To make sure I can still execute wildcard queries on the message field, I put a multi-field mapping on that field. Wildcard/analyzed queries can be done on the message field and case-sensitive ones on the message.case_sensitive field.
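For example, a case-sensitive search would then target the sub-field rather than the base field (a sketch; the search term "Error" is hypothetical):

```json
GET logstash-*/_search
{
  "query": {
    "match": {
      "message.case_sensitive": "Error"
    }
  }
}
```

Note that this analyzer still applies the stop and porter_stem filters, so tokens are stemmed, but their case is preserved because there is no lowercase filter in the chain.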

Upvotes: 0

Vineeth Mohan

Reputation: 19253

In the mapping you can define the field as not_analyzed.

curl -X PUT "http://localhost:9200/sample" -d '{
  "index": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'

echo
curl -X PUT "http://localhost:9200/sample/data/_mapping" -d '{
  "data": {
    "properties": {
      "name": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}'

Now if you index and search normally, the field won't be analyzed, which ensures the search is case sensitive.
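For example, with the field not_analyzed, a term query posted to /sample/data/_search only matches the stored value exactly, including case (a sketch based on the mapping above):

```json
{
  "query": {
    "term": {
      "name": "Binoy"
    }
  }
}
```

This would match only documents whose name is exactly "Binoy"; a document with "binoy" would not be returned.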

Upvotes: 8

Prabin Meitei

Reputation: 2000

It depends on the mapping you have defined for your field name. If you haven't defined any mapping then elasticsearch will treat it as a string and use the standard analyzer (which lower-cases the tokens) to generate tokens. Your query will also use the same analyzer for search, hence matching is done on the lower-cased input. That's why "Binoy" matches "binoy".

To solve it you can define a custom analyzer without lowercase filter and use it for your field name. You can define the analyzer as below

"analyzer": {
                "casesensitive_text": {
                    "type":         "custom",
                    "tokenizer":    "standard",
                    "filter": ["stop", "porter_stem" ]
                }
            }

You can define the mapping for name as below

"name": {
    "type": "string", 
    "analyzer": "casesensitive_text"
}

Now you can do the search on name.

Note: the analyzer above is for example purposes. You may need to change it as per your needs.
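Putting the two snippets together, the index creation request might look like this (a sketch; the index name myindex and the type name data are hypothetical):

```json
PUT /myindex
{
  "settings": {
    "analysis": {
      "analyzer": {
        "casesensitive_text": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["stop", "porter_stem"]
        }
      }
    }
  },
  "mappings": {
    "data": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "casesensitive_text"
        }
      }
    }
  }
}
```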

Upvotes: 6

Andrei Stefan

Reputation: 52368

Have your mapping like:

PUT /whatever
{
  "settings": {
    "analysis": {
      "analyzer": {
        "mine": {
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  },
  "mappings": {
    "type": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "mine"
        }
      }
    }
  }
}

meaning, no lowercase filter for that custom analyzer.
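You can check the behavior with the _analyze API (a sketch; the exact request format varies slightly across Elasticsearch versions):

```json
GET /whatever/_analyze
{
  "analyzer": "mine",
  "text": "Binoy"
}
```

Without a lowercase filter, the returned token keeps its original case, so a match query for "Binoy" will no longer match a document containing "binoy".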

Upvotes: 4
