Reputation: 1575
Suppose I have an AND/MUST query in Elasticsearch on two different indexed fields, as follows:
"bool": {
  "must": [
    {
      "multi_match": {
        "query": "Will",
        "fields": [ "first" ],
        "minimum_should_match": "100%" // assuming this is q1
      }
    },
    {
      "multi_match": {
        "query": "Smith",
        "fields": [ "last" ],
        "minimum_should_match": "100%" // assuming this is q2
      }
    }
  ]
}
Now I want to know how Elasticsearch fetches the documents behind the scenes. Does it get the IDs of all documents that match q1 and then iterate over them to check which ones also match q2,
or
does it intersect the two result sets, and if so, how?
How can I index my data to optimize AND queries on two separate fields?
Upvotes: 1
Views: 124
Reputation: 36777
First, some basics: Elasticsearch uses Lucene behind the scenes. In Lucene, a query returns a scorer, and that scorer is responsible for returning the list of documents matching the query.
Your boolean query will internally be translated to a Lucene BooleanQuery, which in this case will return a ConjunctionScorer, as it has only must clauses.
Each of the clauses is a TermQuery that returns a TermScorer which, when advanced, gives the next matching document in increasing order of document ID.
ConjunctionScorer computes the intersection of the matching documents returned by the scorers for each clause by advancing each scorer in turn until they all land on the same document ID.
So you can think of a TermScorer as something that returns an ordered list of documents, and of a ConjunctionScorer as something that simply intersects two ordered lists.
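To make the "intersecting two ordered lists" idea concrete, here is a minimal Python sketch. It is not Lucene's actual code: the names advance and intersect and the sample document IDs are made up for illustration, and plain Python lists stand in for the postings that a TermScorer would iterate over.

```python
from bisect import bisect_left


def advance(postings, start, target):
    """Return the index of the first doc ID >= target, searching from start.

    Hypothetical stand-in for advancing a scorer to a target document.
    """
    return bisect_left(postings, target, start)


def intersect(postings_a, postings_b):
    """Intersect two ascending doc-ID lists by moving whichever side is behind."""
    i = j = 0
    matches = []
    while i < len(postings_a) and j < len(postings_b):
        a, b = postings_a[i], postings_b[j]
        if a == b:
            # Both lists agree on this document, so it satisfies both clauses.
            matches.append(a)
            i += 1
            j += 1
        elif a < b:
            # List a is behind: skip it forward to the first ID >= b.
            i = advance(postings_a, i, b)
        else:
            # List b is behind: skip it forward to the first ID >= a.
            j = advance(postings_b, j, a)
    return matches


# Made-up doc IDs for first:"will" (q1) and last:"smith" (q2).
first_will = [2, 5, 8, 13, 21, 34]
last_smith = [1, 5, 9, 13, 34, 55]
print(intersect(first_will, last_smith))  # [5, 13, 34]
```

Neither list is materialized as a full set and then scanned against the other; both are walked once, and whichever side is behind gets skipped forward.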
There's not much you can do to optimize it. Since you're not really interested in scores, you could use a filter query instead and let Elasticsearch cache it.
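As a rough sketch of the filter suggestion, the request body could be built like this. It assumes Elasticsearch 2.0 or later, where the bool query accepts a filter clause (older releases used a separate filtered query instead); the helper name build_filter_query is made up for illustration, and the field names first and last are taken from the question.

```python
import json


def build_filter_query(first, last):
    """Build a bool query whose clauses run in filter context (no scoring).

    Hypothetical helper; assumes the bool/filter syntax of Elasticsearch 2.0+.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"match": {"first": first}},
                    {"match": {"last": last}},
                ]
            }
        }
    }


body = build_filter_query("Will", "Smith")
print(json.dumps(body, indent=2))
```

Both clauses still have to match, as with must, but no relevance score is computed for them, which is what makes the results eligible for caching.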
Upvotes: 3