curiousity

Reputation: 179

Am I correct that Elasticsearch simple_query_string does not support wildcards, but only prefix queries?

There are a number of related questions on Stack Overflow, but they mostly suggest other ways to do wildcards. I'm trying to analyze an existing install, so alternatives are not useful.

I think the issue I ran into was that simple_query_string and wildcard queries do different things with infix *.

The query

r*g

expands to "+msg:r +msg:g" with simple_query_string:

GET /test/_validate/query?rewrite=true
{
  "query": {
    "simple_query_string" : {
        "query": "r*g",
        "fields": ["msg"],
        "default_operator": "AND"
    }
  }
}

Returns

{
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "valid": true,
  "explanations": [
    {
      "index": "test",
      "valid": true,
      "explanation": "+msg:r +msg:g"
    }
  ]
}

This shows that simple_query_string is not treating this as a wildcard at all, not even as the prefix r*. So it will not match "running", for example.

On the other hand, wildcard query does handle infix.

GET /test/_validate/query?rewrite=true
{
  "query": {
    "wildcard": {
      "msg": {
        "value": "r*g",
        "case_insensitive": true
      }
    }
  }
}

Returns

{
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "valid": true,
  "explanations": [
    {
      "index": "test",
      "valid": true,
      "explanation": """msg:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@78b3b2e7}"""
    }
  ]
}

While the automaton query could use better output, r*g as a wildcard query will match "running", but a simple_query_string will not.

So, am I correct that the sample query string matches very different sets for simple_query_string vs. wildcard query?

Upvotes: 1

Views: 1856

Answers (1)

Sagar Patel

Reputation: 5486

Yes, you are right. simple_query_string only supports a wildcard at the end of a term.

* at the end of a term signifies a prefix query

The documentation also contains the line below, which is why the infix * is ignored in your scenario.

the simple_query_string query does not return errors for invalid syntax. Instead, it ignores any invalid parts of the query string.

You can use the query_string query instead, but it is strict and validates the query syntax.

You can use the query_string query to create a complex search that includes wildcard characters, searches across multiple fields, and more. While versatile, the query is strict and returns an error if the query string includes any invalid syntax.
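
As a rough sketch, the validate request from the question could be rerun with query_string swapped in, to see how the infix * is rewritten on your install. This assumes the same test index and msg field as in the question. The output is not shown here because it depends on the mapping and the Elasticsearch version.

```
GET /test/_validate/query?rewrite=true
{
  "query": {
    "query_string": {
      "query": "r*g",
      "fields": ["msg"],
      "default_operator": "AND"
    }
  }
}
```

Comparing the "explanation" field of this response with the "+msg:r +msg:g" result from simple_query_string should show whether the * was kept as a wildcard.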

Upvotes: 2
