Reputation: 6964
I have a MongoDB 3.2 server. My collection contains documents like the following:
{
"name": "string",
"explicitMods": [
"+48 to Blah",
"-13% to Blah",
"12 to 18 to Blah"
]
}
If I write this:
myCollection.find({ "explicitMods": /bad string/ })
I get zero results, as expected.
However if I write this:
myCollection.find({ "explicitMods": /\d+ to \d+/ })
I get all documents in the collection. This is unexpected because I actually want documents containing substrings like 12 to 18
. If I change the regexp to /\d+ to \d+z/
it correctly matches nothing.
Upvotes: 3
Views: 95
Reputation: 151112
The query you are issuing "correctly" returns the documents that actually match the condition you ask it to. That being that "at least one" array element in the property you are testing actually matches the condition in the query.
From this we can surmise two possible outcomes:
Your intent is to only return the documents where all array entries satisfy the condition.
Your intent is to "filter" the entries from the "array within the document" only returning those results which satisfy the condition.
From these there are varying approaches. Firstly being that there is in fact no such query operator for MongoDB which demands that "all" array elements must be satisfied by the given condition with a "regular query". Therefore, you need to apply logic in a different form.
One such option is to use the JavaScript evaluation of $where
in a manner that inspects the array content. Here you can apply Array.every()
in order to test your condition, in addition to the regular query filter of course since that actually is doing some useful work.
Given source documents like:
/* 1 */
{
"_id" : ObjectId("5993a35be38f41729f1d6501"),
"name" : "string",
"explicitMods" : [
"+48 to Blah",
"-13% to Blah",
"12 to 18 to Blah"
]
}
/* 2 */
{
"_id" : ObjectId("5993a35be38f41729f1d6502"),
"name" : "string",
"explicitMods" : [
"12 to 18 to Blah"
]
}
Where your intention is only to return the "document" which matches "all" array elements, you issue the statement:
db.myCollection.find({
"explicitMods": /\d+ to \d+/,
"$where": function() { return this.explicitMods.every(e => /\d+ to \d+/.test(e)) }
}
})
Which returns only the matching document:
{
"_id" : ObjectId("5993a35be38f41729f1d6502"),
"name" : "string",
"explicitMods" : [
"12 to 18 to Blah"
]
}
In the alternate case to using $where
, the aggregation framework of MongoDB allows expressions using "native coded operators" which generally apply faster than JavaScript interpreted expressions. However there actually is no such "logical operator" equivalent ( See SERVER-11947 ) of $regex
that is applicable to aggregation operations such as $redact
.
Therefore the only approach available here is to instead use $match
with regular query conditions "after" the array elements have been denormalized using $unwind
:
db.myCollection.aggregate([
// Match "possible" documents
{ "$match": { "explicitMods": /\d+ to \d+/ } },
// unwind to denormalize
{ "$unwind": "$explicitMods" },
// Match on the "array" items now as documents
{ "$match": { "explicitMods": /\d+ to \d+/ } },
// Optionally "re-group" back to documents with only matching array items
{ "$group": {
"_id": "$_id",
"name": { "$first": "$name" },
"explicitMods": { "$push": "$explicitMods" }
}}
])
And that one will return "both" documents, but only those with matching array items:
/* 1 */
{
"_id" : ObjectId("5993a35be38f41729f1d6501"),
"name" : "string",
"explicitMods" : [
"12 to 18 to Blah"
]
}
/* 2 */
{
"_id" : ObjectId("5993a35be38f41729f1d6502"),
"name" : "string",
"explicitMods" : [
"12 to 18 to Blah"
]
}
Of course you can apply a "variation" on that theme and "test the length" of the array against the filter condition in order to decide which document to return:
db.myCollection.aggregate([
{ "$match": { "explicitMods": /\d+ to \d+/ } },
{ "$addFields": { "origSize": { "$size": "$explicitMods" } } },
{ "$unwind": "$explicitMods" },
{ "$match": { "explicitMods": /\d+ to \d+/ } },
{ "$group": {
"_id": "$_id",
"name": { "$first": "$name" },
"origSize": { "$first": "$origSize" },
"explicitMods": { "$push": "$explicitMods" },
}},
{ "$redact": {
"$cond": {
"if": {
"$eq": [
{ "$size": "$explicitMods" },
"$origSize"
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
But whilst it does the same thing as the original option with $where
using "native operators", the general cost of such operations as $unwind
makes it's utility questionable, and therefore likely to take considerably more time and resources to produce the result than the original query.
Upvotes: 1