Chakradar Raju

Reputation: 2811

Why is @ not matching for query_string query in elasticsearch?

I have a field containing email IDs. When I try to match a whole email ID, the query doesn't match the document, but when I leave out the @ it does. I tried replacing @ with . and with *, and neither helped.

How do I match the whole email?

Eg doc:

{
  ...
  "email": "[email protected]"
}

Eg failure query:

{
  "query": {
    "query_string": {
      "default_field": "email",
      "query": "*[email protected]*"
    }
  }
}

Eg success query:

{
  "query": {
    "query_string": {
      "default_field": "email",
      "query": "*ample*"
    }
  }
}

Upvotes: 0

Views: 784

Answers (2)

Amit

Reputation: 32386

As RichieK mentions in the other answer, your query doesn't match because the default analyzer in Elasticsearch is the standard analyzer. During tokenization it splits the text on special characters such as @ and drops them, so none of the indexed tokens contains @.

To make it work, you need to do the following:

  1. Define a custom analyzer that uses the UAX URL Email tokenizer (uax_url_email).

  2. Use that custom analyzer on the fields where you want @ to be searchable. Define this in your index mapping.

  3. Check the output of http://localhost:9200/{your_index_name}/_mapping, replacing {your_index_name} with the name of your index, and verify that the fields now use the custom analyzer.

  4. Re-index all of the data. Changing the analyzer of a field is a breaking change: existing documents only get the expected tokens once they have been re-indexed with the new mapping.

  5. Check the tokens generated for your fields using the _analyze API. They should now include tokens containing @.
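
Steps 1 and 2 could look roughly like the request body below, sent with PUT when creating a new index. This is only a minimal sketch: it assumes Elasticsearch 7 or later (typeless mappings), and the analyzer name email_analyzer and the lowercase filter are my own choices for illustration, not something taken from the question.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "email_analyzer": {
          "type": "custom",
          "tokenizer": "uax_url_email",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "email": {
        "type": "text",
        "analyzer": "email_analyzer"
      }
    }
  }
}
```

Once the documents have been copied into the new index (for example with the _reindex API), each email address should be stored as a single token, so a query that includes @ has something to match against.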

Let me know if you face any issue implementing this.

Upvotes: 1

RichieK

Reputation: 554

Yes. The documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-uaxurlemail-tokenizer.html shows that the standard analyzer converts

POST _analyze
{
  "text": "Email me at john.smith@global-international.com"
}

to

[ Email, me, at, john.smith, global, international.com ]

The uax_url_email tokenizer instead produces

 [ Email, me, at, john.smith@global-international.com ]
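
To reproduce that second result yourself, the tokenizer can be named explicitly in the body of POST _analyze. A minimal sketch, using the example sentence from the linked documentation page:

```json
{
  "tokenizer": "uax_url_email",
  "text": "Email me at john.smith@global-international.com"
}
```

Leaving out the "tokenizer" line makes the request fall back to the standard analyzer, which is how the first token list above is produced.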

Upvotes: 0
