Sofia Braun

Reputation: 194

Elasticsearch multi_match query not working with synonyms and cross_fields

An Elasticsearch multi_match query with the cross_fields type and synonyms is not working as expected.

I have the following configuration:

{
    "my_index": {
        "mappings": {
            "my_mapping": {
                "properties": {
                    "@timestamp": {
                        "type": "date"
                    },
                    "@version": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "field1": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "field2": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            }
        },
        "settings": {
            "index": {
                "analysis": {
                    "filter": {
                        "my_synonym_filter": {
                            "type": "synonym",
                            "synonyms": [
                                "matthew,matt,matty",
                                "thomas,tom,thom,tommy"
                            ]
                        }
                    },
                    "analyzer": {
                        "my_synonyms": {
                            "filter": [
                                "lowercase",
                                "my_synonym_filter"
                            ],
                            "tokenizer": "standard"
                        }
                    }
                }
            }
        }
    }
}

And the following query:

{
    "query": {
        "bool": {
            "should": [
                {
                    "multi_match": {
                        "fields": [
                            "field1^8",
                            "field2^2"
                        ],
                        "query": "Matt And Tom Oldfield",
                        "type": "cross_fields",
                        "analyzer": "my_synonyms"
                    }
                }
            ]
        }
    }
}

But when I execute the query, the synonyms are not expanded into every field. The query explanation is as follows:

(Synonym(field1:matt field1:matthew field1:matty) blended(terms:[field1:and^8.0, field2:and^2.0]) Synonym(field1:thom field1:thomas field1:tom field1:tommy) blended(terms:[field1:oldfield^8.0, field2:oldfield^2.0]))

So if I have "Tom Oldfield" in field1 and "Matt Oldfield" in field2, the query doesn't match that document. As the explanation shows, the synonyms are expanded only for the first field (field1) and not for field2.

If I remove the analyzer from the query, it does match a document with "Tom Oldfield" in field1 and "Matt Oldfield" in field2, and the query explanation is as follows:

(blended(terms:[field1:matt^8.0, field2:matt^2.0]) blended(terms:[field1:and^8.0, field2:and^2.0]) blended(terms:[field1:tom^8.0, field2:tom^2.0]) blended(terms:[field1:oldfield^8.0, field2:oldfield^2.0]))

Is there a way to make the synonyms expand to every field?
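To compare the two explanations side by side, the rewritten query can be requested from the `_validate/query` endpoint with `explain=true`. The following is a minimal sketch, not necessarily how the explanations above were obtained. The host `http://localhost:9200` and the helper names `multi_match_query` and `explain` are assumptions made for illustration.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local node; adjust as needed
INDEX = "my_index"                # index name from the question


def multi_match_query(analyzer=None):
    """Build the query from the question, with or without a query-time analyzer."""
    multi_match = {
        "fields": ["field1^8", "field2^2"],
        "query": "Matt And Tom Oldfield",
        "type": "cross_fields",
    }
    if analyzer is not None:
        multi_match["analyzer"] = analyzer
    return {"query": {"bool": {"should": [{"multi_match": multi_match}]}}}


def explain(body):
    """Send the body to _validate/query?explain=true (requires a running cluster)."""
    request = urllib.request.Request(
        f"{ES_URL}/{INDEX}/_validate/query?explain=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `explain(multi_match_query("my_synonyms"))` and `explain(multi_match_query())` against a running cluster should return one response per variant, so the per-field expansion can be checked directly.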

Upvotes: 1

Views: 1335

Answers (1)

Ivan Mamontov

Reputation: 2924

I am not able to reproduce your issue in my environment with Elasticsearch 5.5.0. Here are my MCVE settings:

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "my_synonym_filter": {
            "type": "synonym",
            "synonyms": [
              "matthew,matt,matty",
              "thomas,tom,thom,tommy"
            ]
          }
        },
        "analyzer": {
          "my_synonyms": {
            "filter": [
              "lowercase",
              "my_synonym_filter"
            ],
            "tokenizer": "standard"
          }
        }
      }
    }
  },
  "mappings": {
    "my_mapping": {
      "properties": {
        "field1": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "field2": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

The following document was indexed:

{ "field1": "Tom Oldfield", "field2": "Matt Oldfield"}
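For anyone repeating this reproduction, the order of REST calls can be sketched as below. This is only an outline under assumptions: the document id `1`, the `refresh=true` parameter, and the helper names `reproduction_steps` and `as_console` are illustrative and not taken from the answer. The paths follow the Elasticsearch 5.x convention of including the mapping type.

```python
import json

INDEX = "my_index"        # index name from the question
DOC_TYPE = "my_mapping"   # mapping type from the settings above (5.x style)


def reproduction_steps(index_body, query_body):
    """Return (method, path, body) tuples in the order they would be sent."""
    document = {"field1": "Tom Oldfield", "field2": "Matt Oldfield"}
    return [
        ("PUT", f"/{INDEX}", index_body),                        # create index
        ("PUT", f"/{INDEX}/{DOC_TYPE}/1?refresh=true", document),  # id 1 is assumed
        ("POST", f"/{INDEX}/_search", query_body),               # run the query
    ]


def as_console(steps):
    """Render the steps as 'METHOD path' lines, each followed by a JSON body."""
    lines = []
    for method, path, body in steps:
        lines.append(f"{method} {path}")
        lines.append(json.dumps(body))
    return "\n".join(lines)
```

Passing the settings above as `index_body` and the question's query as `query_body` gives the three requests in order.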

For the provided query, ES creates the following Lucene query:

((field1:matt)^8.0 | (field1:matthew)^8.0 | (field1:matty)^8.0 | (field2:matt)^2.0 | (field2:matthew)^2.0 | (field2:matty)^2.0) 
((field1:and)^8.0 | (field2:and)^2.0) 
((field1:tom)^8.0 | (field1:thomas)^8.0 | (field1:thom)^8.0 | (field1:tommy)^8.0 | (field2:tom)^2.0 | (field2:thomas)^2.0 | (field2:thom)^2.0 | (field2:tommy)^2.0) 
((field1:oldfield)^8.0 | (field2:oldfield)^2.0)

where the synonyms are expanded for every field.
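To make "expanded for every field" concrete, here is a small pure-Python sketch that rebuilds the shape of the query above from the synonym groups and field boosts. It is only an illustration of the expected structure, not how Elasticsearch or Lucene builds the query internally. Lowercasing plus whitespace splitting stands in for the `standard` tokenizer and `lowercase` filter, and the function names are hypothetical.

```python
SYNONYMS = [
    ["matthew", "matt", "matty"],
    ["thomas", "tom", "thom", "tommy"],
]
FIELDS = {"field1": 8.0, "field2": 2.0}  # field -> boost, from the query


def expand(token, groups=SYNONYMS):
    """Return the token first, followed by the other members of its synonym group."""
    for group in groups:
        if token in group:
            return [token] + [s for s in group if s != token]
    return [token]


def per_field_clause(token, fields=FIELDS):
    """Build one disjunction covering every synonym in every field."""
    parts = [
        f"({field}:{term})^{boost}"
        for field, boost in fields.items()
        for term in expand(token)
    ]
    return "(" + " | ".join(parts) + ")"


def build_query(text):
    """Simplified analysis: lowercase and split on whitespace, one clause per token."""
    return "\n".join(per_field_clause(token) for token in text.lower().split())


print(build_query("Matt And Tom Oldfield"))
```

Each printed line corresponds to one query term, with every synonym appearing under both `field1` and `field2`, which is the behaviour the question expects.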

Upvotes: 1
