Roy Leibovitz

Reputation: 639

Elastic Search multi match gets wrong result

I am sending a query to Elasticsearch to find all segments that have a field matching the query. We are implementing a "free search": the user can type any text, and we build a query that searches for this text across all of the segment's fields. Every segment in which one or more fields contain this text should be returned.

For example:

I would like to get all the segments with the name "tony lopez". Each segment has a "first_name" field and a "last_name" field.

The query our service builds:

  "multi_match": {
    "query": "tony lopez",
    "type": "best_fields",
    "fields": [],
    "operator": "OR"
  }

Using this query, Elasticsearch returns a segment whose "first_name" is "tony" and whose "last_name" is "lopez", but also a segment whose "first_name" is "joe" and whose "last_name" is "tony".

For this type of query, I would like to receive only the segments whose name is "tony" (first_name) "lopez" (last_name).

How can I fix that issue?

Upvotes: 1

Views: 782

Answers (1)

Assael Azran

Reputation: 2993

I hope I'm not jumping to conclusions too soon, but if you want to get only tony and lopez as first name and last name, use this:

GET my_index/_search
{
  "query": { 
   "bool": {
     "must": [
       {
         "match": {
           "first": "tony"
         }
       },
       {
         "match": {
           "last": "lopez"
         }
       }
     ]
   }
  }
}

But if one of your indexed documents contains, for example, tony s as the first name, the query above will return it too.

Why? Because firstname is a text datatype:

A field to index full-text values, such as the body of an email or the description of a product. These fields are analyzed, that is they are passed through an analyzer to convert the string into a list of individual terms before being indexed.

More Details

If you run this query in Kibana:

POST my_index/_analyze
{
  "field": "first", 
  "text": ["tony s"]
}

You will see that tony s is analyzed as two tokens, tony and s.

In the words of the quote above, the string is passed through an analyzer to convert it into a list of individual terms (tony as one term and s as another).

That is why the query above returns tony s in the results: it matches the term tony.
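To make the token matching concrete, here is a minimal Python sketch of the idea. It is not how Elasticsearch is implemented: `analyze` is a hypothetical, ASCII-only stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), and `match` is a toy version of a `match` query with the default OR operator.

```python
import re

def analyze(text):
    # Rough stand-in for the standard analyzer (assumption: ASCII input only):
    # lowercase the text and split it on anything that is not a letter or digit.
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def match(field_value, query):
    # Toy `match` with the default OR operator:
    # the document is a hit if it shares at least one token with the query.
    return bool(set(analyze(field_value)) & set(analyze(query)))

print(analyze("tony s"))        # ['tony', 's']
print(match("tony s", "tony"))  # True
print(match("joe", "tony"))     # False
```

The stored value tony s produces the token tony, so a query for tony hits it even though the full strings differ.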

If you want only exact matches for tony and lopez, use this query instead:

GET my_index/_search
{
  "query": { 
   "bool": {
     "must": [
       {
         "term": {
           "first.keyword": {
             "value": "tony"
           }
         }
       },
       {
         "term": {
           "last.keyword": {
             "value": "lopez"
           }
         }
       }
     ]
   }
  }
}

Read about keyword datatype
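As a rough illustration of why the term query is stricter, the sketch below models a keyword field as a plain string that is compared whole. It assumes that first.keyword and last.keyword exist as keyword sub-fields with no normalizer configured (which is what default dynamic mapping produces for strings); `term` and the sample documents are hypothetical.

```python
def term(keyword_value, value):
    # Toy `term` query on a keyword field: no analysis on either side,
    # so the whole stored string must equal the query value.
    return keyword_value == value

# Hypothetical documents, showing the keyword sub-field values only.
docs = [
    {"first": "tony", "last": "lopez"},
    {"first": "tony s", "last": "lopez"},
    {"first": "Tony", "last": "lopez"},
]

hits = [d for d in docs if term(d["first"], "tony") and term(d["last"], "lopez")]
print(hits)  # [{'first': 'tony', 'last': 'lopez'}]
```

Note the trade-off: tony s no longer matches, but neither does Tony, because an un-normalized keyword comparison is case-sensitive.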

UPDATE

Try this query. It is not perfect: it has the same issue as my tony s example, and if you have a document with first name lopez and last name tony, it will find that too.

GET my_index/_search
{
  "query": { 
   "multi_match": {
     "query": "tony lopez",
     "fields": [],
     "type": "cross_fields",
     "operator": "AND",
     "analyzer": "standard"
   }
  }
}

The cross_fields type is particularly useful with structured documents where multiple fields should match. For instance, when querying the first_name and last_name fields for “Will Smith”, the best match is likely to have “Will” in one field and “Smith” in the other.

cross fields
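To compare the two multi_match variants in this thread, here is a toy Python model of which documents each one selects. It ignores scoring entirely, uses a simplified whitespace tokenizer, and assumes the searched fields are first_name and last_name (the real queries leave "fields" empty); the function names are hypothetical.

```python
def tokens(text):
    # Simplified analysis: lowercase and split on whitespace.
    return text.lower().split()

def best_fields_or(doc, query, fields):
    # Toy best_fields + OR: a hit if any field contains any query term.
    return any(t in tokens(doc[f]) for f in fields for t in tokens(query))

def cross_fields_and(doc, query, fields):
    # Toy cross_fields + AND: every query term must appear in at least one field.
    return all(any(t in tokens(doc[f]) for f in fields) for t in tokens(query))

FIELDS = ["first_name", "last_name"]
docs = [
    {"first_name": "tony", "last_name": "lopez"},
    {"first_name": "joe", "last_name": "tony"},
    {"first_name": "lopez", "last_name": "tony"},
    {"first_name": "tony s", "last_name": "lopez"},
]

for doc in docs:
    print(doc["first_name"], "/", doc["last_name"],
          "| best_fields OR:", best_fields_or(doc, "tony lopez", FIELDS),
          "| cross_fields AND:", cross_fields_and(doc, "tony lopez", FIELDS))
```

In this model cross_fields with AND drops joe / tony, but still keeps lopez / tony and tony s / lopez, which are the two caveats mentioned above.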

Hope it helps

Upvotes: 2
