Base Starr

Reputation: 55

Group documents by subdocument field

I am trying to use MongoDB's aggregation framework to group a collection based on a timestamp and write the result to a new collection with $out. Apologies, I am new to MongoDB.

I have the following JSON structure in my collection

{
    "_id" : "1",
    "parent" : [
        {
            "child" : {
                "child_id" : "1",
                "timestamp" : ISODate("2010-01-08T17:49:39.814Z")
            }
        }
    ]
}

Here is what I have been trying

db.mycollection.aggregate([
        { $project: { child_id: '$parent.child.child_id', timestamp: '$parent.child.timestamp' }},
        { $group: { cid: '$child_id', ts: { $max: '$timestamp'} }},
        { $out : 'mycollectiongrouped'}
        ])

However, I am getting the error below. Any ideas? I assume I am probably using $project incorrectly.

[thread1] Error: command failed: { "ok" : 0, "errmsg" : "the group aggregate field 'cid' must be defined as an expression inside an object", "code" : 15951 } : aggregate failed : _getErrorWithCode@src/mongo/shell/utils.js:25:13

Upvotes: 1

Views: 1182

Answers (2)

Sede

Reputation: 61293

db.collection.aggregate([
    {$group: { 
        _id: "$parent.child.child_id",
        timestamp: {$max: "$parent.child.timestamp"}
    }},
    {$project: {
        cid: {$arrayElemAt: ["$_id", 0]},
        ts: {$arrayElemAt: ["$timestamp", 0]},
        _id: 0
    }},
    {$out: "groupedCollection" }
])

You are missing the _id field, which is mandatory in the $group pipeline stage. That said, since the "parent" field in your document is a one-element array, the $group stage should be the first stage in the pipeline.

By making the $group stage the first stage, you will only need to project one document per group instead of all documents in the collection.

Note that the fields in the resulting documents are arrays, hence the use of the $arrayElemAt operator in the $project stage.
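
If "parent" could hold more than one element, one possible alternative is to flatten it with $unwind before grouping, so that the grouped fields are plain values and $arrayElemAt is no longer needed. The following is only a sketch under that assumption, reusing the collection names from the question; the `typeof db` guard is an addition so the snippet also loads outside the mongo shell. Keep in mind that $unwind, by default, drops documents whose "parent" array is missing or empty.

```javascript
// Sketch: flatten "parent" first so each array element becomes its own document.
const pipeline = [
    { $unwind: "$parent" },
    { $group: {
        _id: "$parent.child.child_id",              // grouping key (mandatory in $group)
        ts: { $max: "$parent.child.timestamp" }     // latest timestamp per child_id
    }},
    { $project: { _id: 0, cid: "$_id", ts: 1 } },   // expose the key as "cid"
    { $out: "mycollectiongrouped" }                 // output collection from the question
];

// Only runs where the shell provides a `db` object.
if (typeof db !== "undefined") {
    db.mycollection.aggregate(pipeline);
}
```

Whether this is preferable depends on the data: for strictly one-element arrays the pipeline above it does the same job without the extra $unwind stage.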

Upvotes: 1

B. Fleming

Reputation: 7230

You need an _id field for the $group stage. This _id determines which documents are grouped together. For instance, if you want to group by child_id, use _id: "$child_id". In that case you do not need a separate cid field; you can simply rename cid to _id.
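
To make the role of _id more concrete, here is a plain-JavaScript illustration of what grouping by a key and taking the maximum timestamp amounts to. It is a conceptual sketch only, not how MongoDB implements $group; the function name `groupMaxTimestamp` and all sample rows except the first timestamp are made up for the example.

```javascript
// Hypothetical sample rows, already flattened to child_id / timestamp.
const docs = [
    { child_id: "1", timestamp: new Date("2010-01-08T17:49:39.814Z") },
    { child_id: "1", timestamp: new Date("2012-03-01T00:00:00.000Z") },
    { child_id: "2", timestamp: new Date("2011-06-15T12:00:00.000Z") }
];

// Rows with the same key fall into the same group; each group keeps
// the largest timestamp seen so far (the role $max plays in $group).
function groupMaxTimestamp(rows) {
    const groups = new Map();
    for (const row of rows) {
        const current = groups.get(row.child_id);
        if (current === undefined || row.timestamp > current) {
            groups.set(row.child_id, row.timestamp);
        }
    }
    // One output object per distinct key, with the key stored as _id.
    return Array.from(groups, ([id, ts]) => ({ _id: id, ts: ts }));
}

const grouped = groupMaxTimestamp(docs);
console.log(grouped);
```

The three input rows collapse into two output objects, one per distinct child_id, which mirrors why $group refuses to run without an _id expression: without a key there is nothing to decide group membership.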

Upvotes: 0
