Arnold
Reputation: 89

ElasticSearch inconsistent wildcard search

I have a strange issue with my wildcard search. I've created an index with the following mapping (screenshot not preserved).

I have the following document there:

(screenshot of the document not preserved)

When I run the following query, I get the document:

{
  "query": {
    "wildcard" : { "email" :  "*asdasd*"  }
  },
  "size": "10",
  "from": 0
}

But when I run this next query, I get nothing back:

{
  "query": {
    "wildcard" : { "email" :  "*one-v*"  }
  },
  "size": "10",
  "from": 0
}

Can you please explain why this happens? Thank you.

Upvotes: 0

Views: 113

Answers (2)

Tushar Shahi
Reputation: 20441

This has to do with how text fields are indexed. By default, the standard analyzer is used, and it splits the text into separate terms before indexing.

Here is an example from the documentation that fits your case too:

The text "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone." is broken into terms [ the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone ].

As you can see, Brown-Foxes is not kept as a single token; it is split at the hyphen into brown and foxes. The same goes for one-v: it is split at the hyphen, so no single indexed term contains one-v.
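To make the splitting concrete, here is a minimal sketch in Python. It is not the standard analyzer (which follows the Unicode text segmentation rules); it is only a rough regex approximation that I am assuming behaves the same way for plain ASCII input like the documentation sentence. The function name `approx_standard_tokens` is made up for this illustration.

```python
import re

# Rough stand-in for the standard analyzer, for ASCII text only (an assumption):
# runs of letters/digits, optionally joined by an apostrophe or a dot that is
# followed by more letters/digits. A hyphen always ends the token.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:['.][A-Za-z0-9]+)*")


def approx_standard_tokens(text):
    """Split text into lowercased tokens, approximating the standard analyzer."""
    return [match.group(0).lower() for match in TOKEN_RE.finditer(text)]


sentence = "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
tokens = approx_standard_tokens(sentence)
print(tokens)
# "Brown-Foxes" comes out as two separate tokens, "brown" and "foxes".
```

For the real token list of your own data, the `_analyze` API of your cluster is the authoritative source; the regex above is only meant to show the idea.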

Upvotes: 0

Bhavya
Reputation: 16172

Elasticsearch uses the standard analyzer if no analyzer is specified. Assuming the email field is of type text, "[email protected]" gets tokenized into:

{
  "tokens": [
    {
      "token": "asdasd",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "one",
      "start_offset": 7,
      "end_offset": 10,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "v.co.il",
      "start_offset": 11,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}

A wildcard query on the email field is matched against each of the tokens above individually, not against the original string. The first query works because *asdasd* matches the token asdasd. No single token matches *one-v*, so the second query returns no results.
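The matching behaviour can be sketched without a cluster. This is only an illustration, assuming that a wildcard pattern has to match one whole indexed term; it uses Python's `fnmatchcase` as a stand-in for the wildcard matcher (it handles `*` and `?` the same way for these patterns, but it also understands `[...]`, which the wildcard query does not). The helper name `wildcard_hits` is hypothetical.

```python
from fnmatch import fnmatchcase

# Terms copied from the token list above (what the text field actually indexed).
indexed_terms = ["asdasd", "one", "v.co.il"]


def wildcard_hits(pattern, terms):
    """Return the terms that the pattern matches in full, one term at a time."""
    return [term for term in terms if fnmatchcase(term, pattern)]


first = wildcard_hits("*asdasd*", indexed_terms)
second = wildcard_hits("*one-v*", indexed_terms)

print(first)   # the term "asdasd" matches, so the document is found
print(second)  # empty: "one" and "v.co.il" are separate terms, neither contains "one-v"
```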

It is better to use a keyword field for wildcard queries. If you have not explicitly defined an index mapping, dynamic mapping indexes strings as text with a .keyword subfield, so you can query email.keyword instead (notice the ".keyword" after the email field). A keyword field is not analyzed: the whole value is indexed as a single term, so the hyphen is preserved.

Modify your query as shown below:

{
  "query": {
    "wildcard": {
      "email.keyword": "*one-v*"
    }
  }
}

The search result will be:

"hits": [
      {
        "_index": "67688032",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "email": "[email protected]"
        }
      }
    ]

Otherwise, you need to change the data type of the email field from text to keyword.
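As a rough sketch of that second option, the request body for a new index could look like the one built below. I am assuming a 7.x cluster with typeless mappings (the `_type` of `_doc` in the result suggests that), and the index name `emails-v2` is made up. Keep in mind that the type of an existing field cannot be changed in place, so this means creating a new index and reindexing the documents into it.

```python
import json

# Hypothetical name for the new index; pick whatever fits your setup.
index_name = "emails-v2"

# Mapping that stores email as a single unanalyzed term (keyword type).
create_index_body = {
    "mappings": {
        "properties": {
            "email": {"type": "keyword"}
        }
    }
}

# Print the request so it can be pasted into Kibana Dev Tools or sent with curl.
print(f"PUT /{index_name}")
print(json.dumps(create_index_body, indent=2))
```

With a mapping like this, the wildcard query would target email directly rather than email.keyword.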

Upvotes: 1
