Suraj Dalvi

Reputation: 1078

ElasticSearch escape special character

How do I escape special characters such as $ in a query_string query? For example, my documents contain the following data:

{
  "title": "Sachin$tendular"
}

I am using the query_string query in the following way:

{
  "query": {
    "query_string": {
      "fields": ["title"],
      "query": "*Sachin$*"
    }
  }
}

This returns no results, but if I remove the $ from the query it works.

So how can we handle $ here?

My title field mapping is:

"title": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword"
    }
  }
}

When I use a query like

[{
  "query_string": {
    "query": "*character*",
    "fields": ["title.raw"],
    "default_operator": "OR"
  }
}]

It works, but when I use a query like

[{
  "query_string": {
    "query": "*Special character*",
    "fields": ["title.raw"],
    "default_operator": "OR"
  }
}]

it does not work, but

[{
  "query_string": {
    "query": "*Special character*",
    "fields": ["title"],
    "default_operator": "OR"
  }
}]

it works. My title data looks like this:

Special character test ! @ # $ % ^ & * ( ) _ + - = { } [ ] | \\ ; : \" ' < > , . / ? ~ ` ₹

So how can I get results for all of these combinations using a single query? And when should I use title versus title.raw here?

Upvotes: 0

Views: 3065

Answers (1)

Bhavya

Reputation: 16172

Assuming the title field is of type text: Elasticsearch uses the standard analyzer when no analyzer is defined for a text field, and that analyzer splits on the $ character and lowercases each token. Sachin$tendular is therefore tokenized to

{
  "tokens": [
    {
      "token": "sachin",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "tendular",
      "start_offset": 7,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
} 
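To make the mismatch concrete, here is a minimal Python sketch. It does not call Elasticsearch; `rough_standard_tokens` and `wildcard_hits` are hypothetical helpers that only approximate the behaviour described above (splitting on non-alphanumeric characters, lowercasing, and matching a wildcard pattern against each whole indexed term).

```python
import re
from fnmatch import fnmatchcase


def rough_standard_tokens(text):
    """Rough stand-in for the standard analyzer (an approximation only):
    split on non-alphanumeric characters and lowercase each piece."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]


def wildcard_hits(pattern, terms):
    """Return the indexed terms that the wildcard pattern matches in full."""
    return [t for t in terms if fnmatchcase(t, pattern)]


# Terms indexed for the text field: the "$" is gone.
text_terms = rough_standard_tokens("Sachin$tendular")
# The keyword sub-field keeps the whole value as a single term.
keyword_terms = ["Sachin$tendular"]

print(text_terms)
# On the text field the pattern is assumed to be lowercased before matching.
print(wildcard_hits("*sachin$*", text_terms))     # no term contains "$"
print(wildcard_hits("*sachin*", text_terms))      # matches the "sachin" term
print(wildcard_hits("*Sachin$*", keyword_terms))  # matches the untouched value
```

No term of the text field contains a $, so a pattern that requires one cannot match, while the same pattern does match the single term stored for the keyword sub-field.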

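Because the $ is dropped at index time, no indexed term of title contains it, so the pattern *Sachin$* cannot match. If you are using the default mapping, run the query_string query on the title.keyword field instead. A keyword field is not analyzed, so the whole value, including the $, is indexed as a single term. Otherwise, change the data type of the title field to keyword.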

{
  "query": {
    "query_string": {
      "fields": ["title.keyword"],
      "query": "*Sachin$*"
    }
  }
}

Alternatively, update your field mapping so that title is a multi-field (a text field with a keyword sub-field), as described in the Elasticsearch documentation on multi-fields.
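On the escaping part of the question: $ is not one of the characters that query_string reserves, so it needs no escaping. The reserved characters listed in the Elasticsearch documentation are + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ / and they are escaped with a backslash, except < and >, which cannot be escaped and have to be removed. A minimal sketch of that rule follows; `escape_query_string` is a hypothetical helper name, not part of any Elasticsearch client.

```python
import json
import re

# Reserved query_string characters that can be backslash-escaped.
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')


def escape_query_string(text):
    """Hypothetical helper: make user input literal inside a query_string query."""
    # "<" and ">" cannot be escaped, so drop them.
    text = text.replace("<", "").replace(">", "")
    # Prefix every remaining reserved character with a backslash.
    return _RESERVED.sub(r"\\\1", text)


user_input = "Sachin$"
body = {
    "query": {
        "query_string": {
            "fields": ["title.keyword"],
            # The surrounding wildcards are added after escaping so they stay active.
            "query": "*" + escape_query_string(user_input) + "*",
        }
    }
}

# json.dumps doubles any backslashes for the JSON request body.
print(json.dumps(body, indent=2))
```

Escaping only makes the characters literal in the query syntax. It does not bring back characters that the analyzer removed from a text field, which is why the keyword field is still needed here.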

Update 1:

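query_string splits the query text on whitespace, so *Special character* is parsed as two separate patterns, *Special and character*, and neither matches the single whole-value term of a keyword field. To search for a wildcard expression that contains spaces, use a wildcard query instead, which treats the entire value as one pattern: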

{
  "query": {
    "wildcard": {
      "title.keyword": {
        "value": "*Special character*"
      }
    }
  }
}
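If the search text comes from user input, the request body can be built programmatically. The sketch below assumes the wildcard query's usual syntax (* and ? are wildcards, backslash escapes them); `escape_wildcard_literal` and `contains_query` are hypothetical helper names used only for illustration.

```python
import json


def escape_wildcard_literal(text):
    """Backslash-escape the characters a wildcard pattern treats specially."""
    out = []
    for ch in text:
        if ch in "*?\\":
            out.append("\\")
        out.append(ch)
    return "".join(out)


def contains_query(field, text):
    """Build a wildcard query body that matches values containing `text`."""
    pattern = "*" + escape_wildcard_literal(text) + "*"
    return {"query": {"wildcard": {field: {"value": pattern}}}}


# Spaces need no special handling: the whole value is one pattern.
print(json.dumps(contains_query("title.keyword", "Special character"), indent=2))
```

Note that patterns with a leading * can be slow on large indices, since every term of the field may have to be examined.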

Upvotes: 1
