Reputation: 4463
I tried these types of queries in Elasticsearch and am wondering which one is the most suitable (most accurate and most efficient). Basically, one person can have multiple sets of names (an array). Each name is split into firstName, surname and middleName. Some persons have just a firstName and a surname. The input parameter is the full name (a combination of first name, surname and middle name in one string). Fuzzy logic is added. One difference I notice is the score.
This is the score of the first result returned.
Does this mean that the second query is the most accurate query for this requirement?
GET /person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "David Bill Gonzalo~",
            "fields": [
              "nameDetails.name.nameValue.firstName",
              "nameDetails.name.nameValue.surname",
              "nameDetails.name.nameValue.middleName"
            ]
          }
        }
      ]
    }
  }
}
GET /person/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "nameDetails.name.nameValue.firstName": "David Bill Gonzalo~"
          }
        },
        {
          "match": {
            "nameDetails.name.nameValue.surname": "David Bill Gonzalo~"
          }
        },
        {
          "match": {
            "nameDetails.name.nameValue.middleName": "David Bill Gonzalo~"
          }
        }
      ]
    }
  }
}
GET /person/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": [
              "nameDetails.name.nameValue.firstName",
              "nameDetails.name.nameValue.surname",
              "nameDetails.name.nameValue.middleName"
            ],
            "query": "David Bill Gonzalo~"
          }
        }
      ]
    }
  }
}
Upvotes: 1
Views: 1561
Reputation: 16172
First Query:
The multi-match query allows us to run a query on multiple fields. It is an extension of the match query.
Since you have not specified a type parameter in the first query, best_fields is used as the type by default. This finds all the documents that match the query in any of the fields, but the _score is calculated only from the best-matching field.
To know more about the types of multi-match queries, refer to this part of the documentation.
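As a rough sketch of what that default amounts to, here is the same multi_match with the type spelled out. The tie_breaker value of 0.3 is only an illustrative assumption; it defaults to 0.0, which is why only the best field counts unless you set it.

```json
GET /person/_search
{
  "query": {
    "multi_match": {
      "query": "David Bill Gonzalo~",
      "type": "best_fields",
      "tie_breaker": 0.3,
      "fields": [
        "nameDetails.name.nameValue.firstName",
        "nameDetails.name.nameValue.surname",
        "nameDetails.name.nameValue.middleName"
      ]
    }
  }
}
```

With a non-zero tie_breaker, the final _score is the best field's score plus tie_breaker times the scores of the other matching fields.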
Second Query:
This is a bool query with three match clauses inside a should clause. The scores from all matching should clauses are added together to calculate the final score here.
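For comparison, and as far as I understand the documentation, a multi_match with type most_fields is executed much like such a bool/should of match clauses, so it should score in a similarly additive way. A minimal sketch:

```json
GET /person/_search
{
  "query": {
    "multi_match": {
      "query": "David Bill Gonzalo~",
      "type": "most_fields",
      "fields": [
        "nameDetails.name.nameValue.firstName",
        "nameDetails.name.nameValue.surname",
        "nameDetails.name.nameValue.middleName"
      ]
    }
  }
}
```

Because the per-field scores are summed, this form will usually report a higher _score than best_fields for the same document. Scores produced by different query shapes are not directly comparable, so a higher number does not by itself mean that a query is more accurate.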
Third Query:
In the third query, query_string is running against multiple fields. Since you have not specified a type parameter here either, best_fields is again used as the type by default. This finds all the documents that match the query, but the _score is calculated only from the best-matching field.
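One detail worth keeping in mind: in query_string syntax the ~ operator applies only to the term it directly follows, so in "David Bill Gonzalo~" only Gonzalo is matched fuzzily. If every part of the name should tolerate typos, a sketch would look like this:

```json
GET /person/_search
{
  "query": {
    "query_string": {
      "fields": [
        "nameDetails.name.nameValue.firstName",
        "nameDetails.name.nameValue.surname",
        "nameDetails.name.nameValue.middleName"
      ],
      "query": "David~ Bill~ Gonzalo~"
    }
  }
}
```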
Since you are querying multiple fields with the same query string, i.e. "David Bill Gonzalo~", in my opinion you should use a multi_match query. A multi_match query also accepts further options, such as boosting one or more fields or setting the type parameter.
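A minimal sketch of such a multi_match query follows. The boost of 2 on surname and the choice of most_fields are assumptions for illustration only, not values tuned for your data. Note also that in match and multi_match the trailing ~ is not an operator (with the default analyzer it is typically just dropped), so fuzzy matching is requested through the fuzziness parameter instead.

```json
GET /person/_search
{
  "query": {
    "multi_match": {
      "query": "David Bill Gonzalo",
      "type": "most_fields",
      "fuzziness": "AUTO",
      "fields": [
        "nameDetails.name.nameValue.firstName",
        "nameDetails.name.nameValue.surname^2",
        "nameDetails.name.nameValue.middleName"
      ]
    }
  }
}
```

The ^2 suffix multiplies the score contribution of the surname field, and "AUTO" lets Elasticsearch pick the allowed edit distance based on the length of each term.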
Upvotes: 3