Reputation: 27
I'm trying to work out for each country how many times the UK appears with the following query:
db.movies.aggregate([
{$unwind:"$countries"},
{$group:{_id:"$countries", total : { $sum: {$cond: [ {
"countries": {"$in" : ["UK", ["$countries"]], "$not" : { "$size" : 1.0}}}, 1, 0]}}}},
{$match: { total: { $gte: 10 }}}
])
which gives me the error "an object representing an expression must have exactly one field: { $in: [ "UK", [ "$countries" ] ], $not: { $size: 1.0 } }", although when I run the problematic part by itself in a find, it works fine:
db.movies.find(
{ "countries" : { "$in" : ["UK", ["$countries"]], "$not" : { "$size" : 1}}},
{ "_id" : 0, "countries" : 1});
result:
{ "countries" : [ "UK", "New Zealand" ] }
{ "countries" : [ "USA", "UK" ] }
{ "countries" : [ "UK", "USA", "France" ] }
{ "countries" : [ "UK", "USA" ] }
{ "countries" : [ "USA", "UK" ] }
{ "countries" : [ "UK", "Italy" ] }...
Upvotes: 1
Views: 3628
Reputation: 28326
One of the most confusing aspects of MongoDB is that it has two expression contexts: the Mongo Query Language (MQL) and the Aggregation Framework. Some operators exist in both, but have very different syntax and behavior in each context.
MQL is used in find filter expressions, and in $match expressions in an aggregation pipeline.
Aggregation expressions are used throughout the rest of the pipeline, and may be used in find or $match as part of an $expr expression.
Confused yet?
Good, so was I when I first saw those.
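To make the two contexts concrete, here is a small sketch of one test written both ways. The MQL filter is a simplified version of the find filter from the question, and the equivalence only holds under the assumption that countries is always an array (the aggregation forms of $in and $size raise an error on non-array input). The variable names are just for illustration.

```javascript
// MQL query object: keyed by field name, with operators nested under it.
const mqlFilter = { countries: { $in: ["UK"], $not: { $size: 1 } } };

// Aggregation-expression form of the same test, wrapped in $expr so it can
// be used where a query object is expected (find or $match).
const exprFilter = {
  $expr: {
    $and: [
      { $in: ["UK", "$countries"] },          // aggregation $in: [value, array]
      { $ne: [{ $size: "$countries" }, 1] }   // aggregation $size returns a number
    ]
  }
};

// In mongosh, either object could be passed to find, assuming a `movies`
// collection whose `countries` field is always an array:
//   db.movies.find(mqlFilter)
//   db.movies.find(exprFilter)
```

Note how the field name moves: in MQL it is the key of the object, while in an aggregation expression it appears as a "$countries" field path inside the operator's arguments.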
I see 3 problems with the query provided:
A query object is passed to $cond instead of a boolean expression
From the documentation, $cond has the following form:
{ $cond: [ <boolean-expression>, <true-case>, <false-case> ] }
In your example, the <boolean-expression> is
{"countries": {"$in" : ["UK", ["$countries"]], "$not" : { "$size" : 1.0}}}
That is a valid query object for the Mongo Query Language, but it is not a boolean expression. I'll come back to that in a moment.
$size as an MQL operator matches any array with the number of elements specified by the argument. The usage in your query is consistent with using $size as this operator.
However, in a $cond expression inside of a $group stage, the aggregation form of the $size operator should be used. In this context, $size counts and returns the total number of items in an array; that is, it yields a number, not a boolean.
Inside of the $cond expression, to test if the size of the countries array is not equal to 1, use:
{$ne: [{$size:"$countries"},1]}
When using $in as an MQL operator, the syntax is:
{ field: { $in: [<value1>, <value2>, ... <valueN> ] } }
When using $in as an aggregation operator, the syntax is:
{ $in: [ <expression>, <array expression> ] }
In this query, to test if the countries array contains "UK", use:
{$in: ["UK", "$countries"]}
To combine these all together, use the $and operator:
{$cond: [
  {$and: [
    {$in: ["UK", "$countries"]},
    {$ne: [{$size: "$countries"}, 1]}
  ]},
  1,
  0
]}
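As a rough illustration of what that combined expression evaluates to for a single document, here is a plain-JavaScript equivalent. This is not MongoDB code: ukFlag is a hypothetical helper, and it assumes countries is still an array on the document being tested.

```javascript
// Plain-JavaScript mirror of:
//   {$cond: [{$and: [{$in: ["UK", "$countries"]},
//                    {$ne: [{$size: "$countries"}, 1]}]}, 1, 0]}
// Assumes doc.countries is an array.
function ukFlag(doc) {
  const containsUk = doc.countries.includes("UK"); // {$in: ["UK", "$countries"]}
  const notSingle = doc.countries.length !== 1;    // {$ne: [{$size: "$countries"}, 1]}
  return containsUk && notSingle ? 1 : 0;          // $cond true-case / false-case
}

console.log(ukFlag({ countries: ["UK", "USA"] }));     // 1
console.log(ukFlag({ countries: ["UK"] }));            // 0
console.log(ukFlag({ countries: ["USA", "France"] })); // 0
```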
The $unwind stage will explode the array so that each document passed to the following pipeline stages contains only one element from the original array. Thus the field countries in the documents passed to the $group stage will be a single string value, not an array.
To test this, run the aggregation with just the $unwind stage, and examine the results.
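If a database is not at hand, the effect can be approximated in plain JavaScript. The unwind function below is a hypothetical stand-in that only handles the case where the field is an array; the real $unwind stage has further options and edge-case behavior that are not modeled here.

```javascript
// Rough stand-in for {$unwind: "$countries"}: emit one copy of the document
// per array element, with the array field replaced by that single element.
// Assumes doc[field] is always an array.
function unwind(docs, field) {
  return docs.flatMap(doc =>
    doc[field].map(value => ({ ...doc, [field]: value }))
  );
}

const sample = [{ _id: 1, countries: ["UK", "USA"] }];
const unwound = unwind(sample, "countries");

console.log(unwound);
// [ { _id: 1, countries: 'UK' }, { _id: 1, countries: 'USA' } ]
console.log(typeof unwound[0].countries); // string
```

Once countries is a string, neither the array-length test nor the membership test from the $cond expression has an array to work on.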
Based on the statements in the comments, the goal is to count the occurrences of each other country that appears in an array that also contains "UK". To accomplish this, use the following aggregation stages:
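db.movies.aggregate([
  // Keep only movies whose countries array contains "UK"
  {$match: {countries: "UK"}},
  // Emit one document per country in each remaining movie
  {$unwind: "$countries"},
  // Drop the "UK" entries themselves
  {$match: {countries: {$ne: "UK"}}},
  // Count how often each other country appeared alongside "UK"
  {$group: {
    _id: "$countries",
    count: {$sum: 1}
  }}
])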
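As a sanity check on what these stages compute, here is a plain-JavaScript sketch that walks the same four steps over the six sample documents shown in the question. It is not how MongoDB executes the pipeline, and the names sampleMovies and coCounts are made up for the example.

```javascript
// The six documents from the find output in the question.
const sampleMovies = [
  { countries: ["UK", "New Zealand"] },
  { countries: ["USA", "UK"] },
  { countries: ["UK", "USA", "France"] },
  { countries: ["UK", "USA"] },
  { countries: ["USA", "UK"] },
  { countries: ["UK", "Italy"] }
];

const coCounts = {};
for (const { countries } of sampleMovies) {
  if (!countries.includes("UK")) continue;           // {$match: {countries: "UK"}}
  for (const country of countries) {                 // {$unwind: "$countries"}
    if (country === "UK") continue;                  // {$match: {countries: {$ne: "UK"}}}
    coCounts[country] = (coCounts[country] || 0) + 1; // {$group: ..., count: {$sum: 1}}
  }
}

console.log(coCounts);
// { 'New Zealand': 1, USA: 4, France: 1, Italy: 1 }
```

Movies that list only "UK" pass the first match but contribute nothing after the second, so no separate size check is needed.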
Upvotes: 3