Karim Barrane

Reputation: 23

Trouble making a query with Elasticsearch

Well guys, I hope you're doing fine in these epidemic times. I'm having trouble escaping special characters in an Elasticsearch query. Here is what I want to do:

SELECT * FROM table WHERE ext LIKE '%6500%' AND start_time LIKE '%-01-%'

Here is what I did:

   "query": {
       "bool": {
           "must": [
               {
                   "query_string": {
                       "query": "*6500*",
                       "fields": [
                           "extension"
                       ],
                       "analyze_wildcard": true
                   }
               },
               {
                   "query_string": {
                       "query": "*\\-01\\-*",
                       "fields": [
                           "start_time"
                       ],
                       "analyze_wildcard": true
                   }
               }
           ]
       }
   }

The first one works, but the second doesn't give what I want. By the way, the start_time field looks like this, for example: 2020-01-03 15:03:45, and it's a keyword type (I found it like that).

Upvotes: 0

Views: 68

Answers (2)

jaspreet chahal

Reputation: 9099

You are indexing your field as type text with a sub-field of type keyword. Text fields are broken into tokens, e.g. "2020-01-12" will be stored as ["2020","01","12"]. You need to run your query on the keyword sub-field, using "start_time.keyword" in the fields list:

{
  "query": {
       "bool": {
           "must": [
               {
                   "query_string": {
                       "query": "*-01-*",
                       "fields": [
                           "start_time.keyword"
                       ],
                       "analyze_wildcard": true
                   }
               }
           ]
       }
   }
}

As @joe mentioned, wildcard queries perform poorly, so it is better to use a date field.
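
If you do stay with a string field, a term-level wildcard query is one alternative worth trying. This is only a sketch: the wildcard query is not run through the query_string parser, so the hyphens are not treated as operators and need no escaping. The field name start_time.keyword assumes the text-plus-keyword mapping described above; if start_time is itself mapped as keyword, use start_time instead.

{
  "query": {
    "wildcard": {
      "start_time.keyword": {
        "value": "*-01-*"
      }
    }
  }
}

The leading wildcard still forces a scan over the indexed terms, so the performance caveat applies here too.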

Upvotes: 0

Joe - Check out my books

Reputation: 16925

If you're forced to use the keyword type in your start_time, the following works -- no need for the leading & trailing wildcards since your start_time will adhere to a certain format:

GET karim/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "-01-",
            "fields": [
              "start_time"
            ]
          }
        }
      ]
    }
  }
}

It's advisable, though, to use date whenever working with date(time)s. So set your index up like so:

PUT karim
{
  "mappings": {
    "properties": {
      "start_time": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}

and query like so

GET karim/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "start_time": {
              "gte": "01",
              "lt": "02",
              "format": "MM"
            }
          }
        }
      ]
    }
  }
}

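for the month of January. Note that the range query fills missing date components with defaults, so with "format": "MM" the year falls back to 1970. To match a real year, include it in both the values and the format (for example "2020-01" and "2020-02" with "yyyy-MM").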

This approach should be considerably faster than wildcard textual queries, especially when you're querying multiple ranges and possibly intend to aggregate down the road.
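
As a sketch of a year-specific variant, the request below selects January 2020 using full timestamps. It assumes start_time is mapped as a date with the format yyyy-MM-dd HH:mm:ss, so the bounds are parsed with the field's own format and no "format" parameter is needed. The year 2020 is only an illustration, taken from the sample value in the question.

GET karim/_search
{
  "query": {
    "range": {
      "start_time": {
        "gte": "2020-01-01 00:00:00",
        "lt": "2020-02-01 00:00:00"
      }
    }
  }
}

Using a half-open interval (gte on the first instant of the month, lt on the first instant of the next) avoids having to know the last day of the month.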

Upvotes: 0
