I have a collection with documents like this one:
{
  _id : "1",
  arrayProperty : ["1","2","3"]
}
I want to find the documents in which every element of arrayProperty is contained in some given array.
Suppose I have this collection :
{_id : "1", arrayProperty : ["1", "2"]}
{_id : "2", arrayProperty : ["1", "4"]}
{_id : "3", arrayProperty : ["1", "7", "8"]}
{_id : "4", arrayProperty : ["1", "9"]}
and I want to find the documents in which all arrayProperty elements are contained in ["1", "2", "3", "4"].
It should return :
{_id : "1", arrayProperty : ["1", "2"]}
{_id : "2", arrayProperty : ["1", "4"]}
Upvotes: 0
The basic concept here is to look for elements which are NOT in the list of possible values and then "exclude" any document that has one. That means using $elemMatch with $nin for the list, and $not to reverse the logic:
db.collection.find({
  "arrayProperty": {
    "$not": { "$elemMatch": { "$nin": ["1", "2", "3", "4"] } }
  }
})
Which returns the correct documents:
{ "_id" : "1", "arrayProperty" : [ "1", "2" ] }
{ "_id" : "2", "arrayProperty" : [ "1", "4" ] }
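To make the logic concrete, here is a minimal plain-JavaScript sketch of the same "no element outside the list" test, run in memory against the sample documents from the question. The names docs, allowed, hasElementOutside and matches are made up for illustration, and this only mirrors the reasoning; it is not how the query engine evaluates the operators.

```javascript
// Sample documents copied from the question.
const docs = [
  { _id: "1", arrayProperty: ["1", "2"] },
  { _id: "2", arrayProperty: ["1", "4"] },
  { _id: "3", arrayProperty: ["1", "7", "8"] },
  { _id: "4", arrayProperty: ["1", "9"] },
];
const allowed = ["1", "2", "3", "4"];

// Plays the role of $elemMatch + $nin: is at least one element outside the list?
const hasElementOutside = (arr) => arr.some((value) => !allowed.includes(value));

// Plays the role of $not: keep the document only when no such element exists.
const matches = (doc) => !hasElementOutside(doc.arrayProperty);

const matchedIds = docs.filter(matches).map((doc) => doc._id);
console.log(matchedIds); // [ '1', '2' ]
```

Note that an empty array passes this test vacuously, since it has no element outside the list. The query form should behave the same way for empty arrays, so keep that in mind if such documents exist.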
That uses the native operators of the "query engine" to evaluate the expression, as opposed to "forced calculation" via $expr or $where, which we will mention later. The results are right, but the one problem is that this operator pattern prevents the use of any index. Fortunately there is something we can do about that:
db.collection.find({
  "arrayProperty": {
    "$in": ["1", "2", "3", "4"],
    "$not": { "$elemMatch": { "$nin": ["1", "2", "3", "4"] } }
  }
})
Whilst it might seem a little funny at first, adding the $in here is a valid condition. What it does for the query is ensure that an index (assuming one exists on arrayProperty) is actually used when selecting candidate documents. In the question's sample that is still "ALL" of the documents presented, but in the real world not every document will typically match the list of arguments.
Essentially it changes the winning plan reported in the explain output from this:
"winningPlan" : { "stage" : "COLLSCAN"
To this:
"winningPlan" : { "stage" : "FETCH",
"inputStage" : { "stage" : "IXSCAN",
That makes $in a worthwhile filter to add to the expression, and the "native query operator" expression is the fastest way to do this.
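Since the list of values appears twice in that filter, a small helper can build it from a single array so the two copies cannot drift apart. This is only a sketch: buildSubsetFilter is a hypothetical name, not part of any MongoDB API, and the shell commands in the trailing comments assume a collection literally named "collection" with an index on arrayProperty.

```javascript
// Hypothetical helper (not a MongoDB API): builds the filter document
// from one list so "$in" and "$nin" always use the same values.
function buildSubsetFilter(field, allowed) {
  return {
    [field]: {
      "$in": allowed,                                // lets an index select candidates
      "$not": { "$elemMatch": { "$nin": allowed } }  // rejects any element outside the list
    }
  };
}

const filter = buildSubsetFilter("arrayProperty", ["1", "2", "3", "4"]);
console.log(JSON.stringify(filter));

// In the mongo shell, assuming the collection and index described above:
//   db.collection.createIndex({ "arrayProperty": 1 })
//   db.collection.find(filter).explain()
```

Running explain() on the resulting filter is how you would check whether the winning plan really switched from COLLSCAN to IXSCAN on your own data.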
The problem with $expr (aside from only being available from MongoDB 3.6) is that the "whole collection" needs to be scanned in order to apply the "aggregation operator expression" it contains. Of course, we also just learned what $in adds to the query:
db.collection.find({
  "arrayProperty": { "$in": ["1", "2", "3", "4"] },
  "$expr": { "$setIsSubset": ["$arrayProperty", ["1", "2", "3", "4"]] }
})
This has a similar IXSCAN input, where an index is present, because of the $in, and uses the $setIsSubset boolean condition only to reject the other documents found in the index selection.
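For reference, $setIsSubset treats both arrays as sets, so duplicate entries are ignored and an empty array counts as a subset of anything. A rough plain-JavaScript approximation of that check is sketched below; the function name setIsSubset is just borrowed from the operator for readability, and this is not MongoDB's implementation.

```javascript
// Rough approximation of the $setIsSubset check in plain JavaScript.
// Using a Set gives constant-time lookups and naturally ignores duplicates.
function setIsSubset(candidate, allowed) {
  const allowedSet = new Set(allowed);
  return candidate.every((value) => allowedSet.has(value));
}

const allowedValues = ["1", "2", "3", "4"];
console.log(setIsSubset(["1", "1", "2"], allowedValues)); // true
console.log(setIsSubset(["1", "7", "8"], allowedValues)); // false
console.log(setIsSubset([], allowedValues));              // true
```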
Earlier forms of usage with prior MongoDB releases are less ideal:
db.collection.aggregate([
  { "$match": { "arrayProperty": { "$in": ["1", "2", "3", "4"] } } },
  { "$redact": {
    "$cond": {
      "if": { "$setIsSubset": ["$arrayProperty", ["1", "2", "3", "4"]] },
      "then": "$$KEEP",
      "else": "$$PRUNE"
    }
  }}
])
Or using $where:
db.collection.find({
  "arrayProperty": { "$in": ["1", "2", "3", "4"] },
  "$where": function() {
    return this.arrayProperty.every(a => ["1", "2", "3", "4"].some(s => a === s))
  }
})
So all of these get the job done, but the combination of $elemMatch with $nin and $not, together with the $in operator for index selection, is what you really want. It is supported in all versions as well.
Upvotes: 1