Reputation: 24527
I have a flat collection of documents, where some documents have a parent: ObjectId
field, which points another document from the same collection, i.e.:
{id: 1, metadata: {text: "I'm a parent"}}
{id: 2, metadata: {text: "I'm child 1", parent: 1}}
Now I'd like to retrieve all parents where metadata.text = "I'm a parent"
plus it's child elements. But I want that data in a nested format, so I can simply process it afterwards without having a look at metadata.parent
. The output should look like:
{
id: 1,
metadata: {text: "I'm a parent"},
children: [
{id: 2, metadata: {text: "I'm child 1", parent: 1}}
]
}
(children
could also be part of the parent's metadata
object if that's easier)
Why don't I save the documents in a nested structure? I don't want to store the data in a nested format in DB, because those documents are part of GridFS.
The main problem is: How can I tell MongoDB to nest a whole document? Or do I have to use Mongo's aggregation framework for that task?
Upvotes: 2
Views: 1884
Reputation: 136
I wanted a similar result to Neil Lunn's answer except I wanted to fetch all parents regardless of them having children or not. I also wanted to generalise it to work across any collection that had a single level of nested children.
Here's my query based on Neil Lunn's answer
db.collection.aggregate([
{
$group: {
_id: {
$ifNull: ["$parent", "$_id"]
},
parent: {
$addToSet: {
$cond: [
{
$ifNull: ["$parent", false]
}, false, "$$ROOT"
]
}
},
children: {
$push: {
$cond: [
{
$ifNull: ["$parent", false]
}, "$$ROOT", false
]
}
}
}
}, {
$project: {
parent: {
$setDifference: ["$parent", [false]]
},
children: {
$setDifference: ["$children", [false]]
}
}
}, {
$unwind: "$parent"
}
])
This results in every parent being returned where the parent field contains the whole parent document and the children field returning either an empty array if the parent has no children or an array of child documents.
{
_id: PARENT_ID
parent: PARENT_OBJECT
children: [CHILD_OBJECTS]
}
Upvotes: 0
Reputation: 151122
For the sort of "projection" you are asking for then the aggregation framework is the correct tool as this sort of "document re-shaping" is only really supported there.
The other case is the "parent/child" thing, where you again need to be "creative" when grouping using the aggregation framework. The full operations show what is essentially involved:
db.collection.aggregate([
// Group parent and children together with conditionals
{ "$group": {
"_id": { "$ifNull": [ "$metadata.parent", "$_id" ] },
"metadata": {
"$addToSet": {
"$cond": [
{ "$ifNull": [ "$metadata.parent", false ] },
false,
"$metadata"
]
}
},
"children": {
"$push": {
"$cond": [
{ "$ifNull": [ "$metadata.parent", false ] },
"$$ROOT",
false
]
}
}
}},
// Filter out "false" values
{ "$project": {
"metadata": { "$setDifference": [ "$metadata", [false] ] },
"children": { "$setDifference": [ "$children", [false] ] }
}},
// metadata is an array but should only have one item
{ "$unwind": "$metadata" },
// This is essentially sorting the children as "sets" are un-ordered
{ "$unwind": "$children" },
{ "$sort": { "_id": 1, "children._id": 1 } },
{ "$group": {
"_id": "$_id",
"metadata": { "$first": "$metadata" },
"children": { "$push": "$children" }
}}
])
The main thing here is the $ifNull
operator used on the grouping _id
. This will choose to $group
on the "parent" field where present, otherwise using the general document _id
.
Similar things are done with the $cond
operator later where the evaluation is made of which data to add to the array or "set". In the following $project
the false
values are filtered out by use of the $setDifference
operator.
If the final $sort
and $group
there seem confusing, then the actual reason is because the operator used is a "set" operator the resulting "set" is considered to be un-ordered. So really that part is just there to make sure that the array contents appear in order of their own _id
field.
Without the additional operators from MongoDB 2.6 this can still be done, but just a little differently.
db.collection.aggregate([
{ "$group": {
"_id": { "$ifNull": [ "$metadata.parent", "$_id" ] },
"metadata": {
"$addToSet": {
"$cond": [
{ "$ifNull": [ "$metadata.parent", false ] },
false,
"$metadata"
]
}
},
"children": {
"$push": {
"$cond": [
{ "$ifNull": [ "$metadata.parent", false ] },
{ "_id": "$_id","metadata": "$metadata" },
false
]
}
}
}},
{ "$unwind": "$metadata" },
{ "$match": { "metadata": { "$ne": false } } },
{ "$unwind": "$children" },
{ "$match": { "children": { "$ne": false } } },
{ "$sort": { "_id": 1, "children._id": 1 } },
{ "$group": {
"_id": "$_id",
"metadata": { "$first": "$metadata" },
"children": { "$push": "$children" }
}}
])
Essentially the same thing but without the newer operators introduced in MongoDB 2.6, so this would work in earlier versions as well.
This will all be fine as long as your relationships are a single level of parent and child. For nested levels you would need to invoke a mapReduce process instead.
Upvotes: 4