Reputation: 1828
How do you write NEST code to generate an Elasticsearch query for this simple boolean logic?
term1 && (term2 || term3 || term4)
Here is pseudocode for the logic I am trying to implement with a NEST (5.2) statement that queries Elasticsearch (5.2):
// additional requirements
( truckOemName = "HYSTER" && truckModelName = "S40FT" && partCategoryCode = "RECO" && partID != "")
//Section I can't get working correctly
AND (
( SerialRangeInclusiveFrom <= "F187V-6785D" AND SerialRangeInclusiveTo >= "F187V-6060D" )
OR
( SerialRangeInclusiveFrom = "" || SerialRangeInclusiveTo = "" )
)
The "Combining queries with || or should clauses" section in Writing Bool Queries mentions:
The bool query does not quite follow the same boolean logic you expect from a programming language. term1 && (term2 || term3 || term4) does not become
bool
|___must
|   |___term1
|
|___should
    |___term2
    |___term3
    |___term4
you could get back results that only contain term1
which is exactly what I think is happening.
But their solution is beyond my understanding of how to apply it with NEST. Is the answer one of these?
- Add parentheses to force evaluation order (I already do)
- Use a boost factor? (what?)
Here's the NEST code
var searchDescriptor = new SearchDescriptor<ElasticPart>();
var terms = new List<Func<QueryContainerDescriptor<ElasticPart>, QueryContainer>>
{
    s =>
        (s.TermRange(r => r.Field(f => f.SerialRangeInclusiveFrom)
            .LessThanOrEquals(dataSearchParameters.SerialRangeEnd))
        &&
        s.TermRange(r => r.Field(f => f.SerialRangeInclusiveTo)
            .GreaterThanOrEquals(dataSearchParameters.SerialRangeStart)))
        //None of the data that matches these ORs returns with the query this code generates, below.
        ||
        (!s.Exists(exists => exists.Field(f => f.SerialRangeInclusiveFrom))
        ||
        !s.Exists(exists => exists.Field(f => f.SerialRangeInclusiveTo))
        )
};
//Terms is the piece in question
searchDescriptor.Query(s => s.Bool(bq => bq.Filter(terms))
    && !s.Terms(term => term.Field(x => x.OemID)
        .Terms(RulesHelper.GetOemExclusionList(exclusions))));
searchDescriptor.Aggregations(a => a
    .Terms(aggPartInformation, t => t.Script(s => s.Inline(script)).Size(50000))
);
searchDescriptor.Type(string.Empty);
searchDescriptor.Size(0);
var searchResponse = ElasticClient.Search<ElasticPart>(searchDescriptor);
Here's the ES JSON query it generates
{
"query":{
"bool":{
"must":[
{
"term":{ "truckOemName": { "value":"HYSTER" }}
},
{
"term":{ "truckModelName": { "value":"S40FT" }}
},
{
"term":{ "partCategoryCode": { "value":"RECO" }}
},
{
"bool":{
"should":[
{
"bool":{
"must":[
{
"range":{ "serialRangeInclusiveFrom": { "lte":"F187V-6785D" }}
},
{
"range":{ "serialRangeInclusiveTo": { "gte":"F187V-6060D" }}
}
]
}
},
{
"bool":{
"must_not":[
{
"exists":{ "field":"serialRangeInclusiveFrom" }
}
]
}
},
{
"bool":{
"must_not":[
{
"exists":{ "field":"serialRangeInclusiveTo" }
}
]
}
}
]
}
},
{
"exists":{
"field":"partID"
}
}
]
}
}
}
Here's the query we'd like it to generate that seems to work.
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"term": { "truckOemName": { "value": "HYSTER" }}
},
{
"term": {"truckModelName": { "value": "S40FT" }}
},
{
"term": {"partCategoryCode": { "value": "RECO" }}
},
{
"exists": { "field": "partID" }
}
],
"should": [
{
"bool": {
"must": [
{
"range": { "serialRangeInclusiveFrom": {"lte": "F187V-6785D"}}
},
{
"range": {"serialRangeInclusiveTo": {"gte": "F187V-6060D"}}
}
]
}
},
{
"bool": {
"must_not": [
{
"exists": {"field": "serialRangeInclusiveFrom"}
},
{
"exists": { "field": "serialRangeInclusiveTo"}
}
]
}
}
]
}
}
]
}
}
}
Upvotes: 4
Views: 4648
Reputation: 125488
With overloaded operators for bool queries, it is not possible to express a must clause combined with a should clause, i.e.
term1 && (term2 || term3 || term4)
becomes
bool
|___must
    |___term1
    |___bool
        |___should
            |___term2
            |___term3
            |___term4
which is a bool query with two must clauses, where the second must clause is a bool query in which at least one of the should clauses has to match. NEST combines the queries like this because it matches the expectation for boolean logic within .NET.
If it did become
bool
|___must
|   |___term1
|
|___should
    |___term2
    |___term3
    |___term4
a document is considered a match if it satisfies only the must clause. The should clauses in this case act as a boost, i.e. if a document matches one or more of the should clauses in addition to the must clause, then it will have a higher relevancy score, assuming that term2, term3 and term4 are queries that calculate a relevancy score.
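As a rough illustration of the operator-combined form described above, here is a minimal sketch. The Doc type, its Tag property and the literal values are hypothetical, and it assumes the NEST 5.x static Query<T> helper; serializing the resulting QueryContainer with a client should show the nested must/should shape from the first tree.

```csharp
using Nest;

// Hypothetical document type, used only for illustration.
public class Doc
{
    public string Tag { get; set; }
}

public static class OperatorExample
{
    // Builds term1 && (term2 || term3 || term4) with NEST's overloaded operators.
    // The || group becomes a bool query with should clauses, and && places that
    // bool query next to term1 inside an outer must clause.
    public static QueryContainer Build()
    {
        return Query<Doc>.Term(f => f.Tag, "term1")
            && (Query<Doc>.Term(f => f.Tag, "term2")
                || Query<Doc>.Term(f => f.Tag, "term3")
                || Query<Doc>.Term(f => f.Tag, "term4"));
    }
}
```

Because the should clauses sit inside their own bool query with no sibling must clause, at least one of them has to match for a document to be returned.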
On this basis, the query that you would like to generate expresses that for a document to be considered a match, it must match all 4 of the queries in the must clause
"must": [
{
"term": { "truckOemName": { "value": "HYSTER" }}
},
{
"term": {"truckModelName": { "value": "S40FT" }}
},
{
"term": {"partCategoryCode": { "value": "RECO" }}
},
{
"exists": { "field": "partID" }
}
],
then, for documents matching the must clauses, if
- it has a serialRangeInclusiveFrom less than or equal to "F187V-6785D" and a serialRangeInclusiveTo greater than or equal to "F187V-6060D"
or
- it has neither a serialRangeInclusiveFrom nor a serialRangeInclusiveTo
then boost that document's relevancy score. The crucial point is that
If a document matches the must clauses but does not match any of the should clauses, it will still be a match for the query (but have a lower relevancy score).
If that is the intent, this query can be constructed using the longer form of the Bool query.
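A minimal sketch of what that longer form might look like with the NEST 5.x fluent API follows. It is an illustration under assumptions rather than a verified implementation: only SerialRangeInclusiveFrom, SerialRangeInclusiveTo and OemID appear in the question's code, so the other ElasticPart property names are guessed from the JSON field names, and the method and parameter names are hypothetical.

```csharp
using Nest;

// Hypothetical POCO. TruckOemName, TruckModelName, PartCategoryCode and PartID
// are assumed names inferred from the JSON fields in the question.
public class ElasticPart
{
    public string TruckOemName { get; set; }
    public string TruckModelName { get; set; }
    public string PartCategoryCode { get; set; }
    public string PartID { get; set; }
    public string SerialRangeInclusiveFrom { get; set; }
    public string SerialRangeInclusiveTo { get; set; }
}

public static class LongFormBoolExample
{
    // serialStart and serialEnd stand in for dataSearchParameters.SerialRangeStart/End.
    public static ISearchResponse<ElasticPart> Search(
        IElasticClient client, string serialStart, string serialEnd)
    {
        return client.Search<ElasticPart>(s => s
            .Size(0)
            .Query(q => q
                .Bool(b => b
                    // Every must clause has to match.
                    .Must(
                        m => m.Term(t => t.Field(f => f.TruckOemName).Value("HYSTER")),
                        m => m.Term(t => t.Field(f => f.TruckModelName).Value("S40FT")),
                        m => m.Term(t => t.Field(f => f.PartCategoryCode).Value("RECO")),
                        m => m.Exists(e => e.Field(f => f.PartID))
                    )
                    // should clauses next to must are optional and only affect scoring.
                    .Should(
                        sh => sh.TermRange(r => r.Field(f => f.SerialRangeInclusiveFrom)
                                  .LessThanOrEquals(serialEnd))
                              && sh.TermRange(r => r.Field(f => f.SerialRangeInclusiveTo)
                                  .GreaterThanOrEquals(serialStart)),
                        sh => !sh.Exists(e => e.Field(f => f.SerialRangeInclusiveFrom))
                              && !sh.Exists(e => e.Field(f => f.SerialRangeInclusiveTo))
                    )
                )
            )
        );
    }
}
```

If at least one of the should clauses is actually required, one option that I believe exists on the same descriptor is .MinimumShouldMatch(1); the operator-combined query from the question already has that effect, because its should clauses are nested in their own bool query.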
Upvotes: 2