Chris
Chris

Reputation: 3715

Find latest entries in array over all documents

I have multiple documents which contain an array with objects. How can I find the latest entries in the array but over all documents. As an example:

{
  doctitle: “some title”,
  conditions: [
    {
      createdAt: ISODate("2014-12-21T13:59:26Z"),
      title: “some title 1”
    }, 
    {
      createdAt: ISODate("2014-12-22T13:59:26Z"),
      title: “some title 2”
    }
  ]
},
{
  doctitle: “other title”,
  conditions: [
    {
      createdAt: ISODate("2014-12-20T13:59:26Z"),
      title: “some title 3”
    }, 
    {
      createdAt: ISODate("2014-12-23T13:59:26Z"),
      title: “some title 4”
    }
  ]
}

I would like to find the latest 3 conditions. It should output each document just with the docTitle and the one condition found.

{
  docTitle: "other title" ,
  condition: [ 
      {
      createdAt: ISODate("2014-12-23T13:59:26Z"),
      title: “some title 4”
     }
  ]
},
{
  docTitle: "some title" ,
  condition: [ 
      {
      createdAt: ISODate("2014-12-22T13:59:26Z"),
      title: “some title 2”
     }
  ]
},
{
  docTitle: "some title" ,
  condition: [ 
      {
      createdAt: ISODate("2014-12-21T13:59:26Z"),
      title: “some title 1”
     }
  ]
}

Upvotes: 2

Views: 30

Answers (1)

chridam
chridam

Reputation: 103365

Use the following aggregation pipeline which does an initial $unwind operator to "denormalise" the documents by the conditions array field so that each output document replaces the array with an element value. For each input document, outputs n documents where n is the number of array elements and can be zero for an empty array.

The next pipeline step uses the $sort operator to re-order the documents by the embedded createdAt field which is necessary for the next couple of pipeline steps, i.e. the $limit and $group operators.

Since you require only the top three documents ordered by the createdAt field descending, $limit does just that for you by passing the first 3 documents unmodified to the pipeline where the specified limit is 3.

The preceding $group pipeline stage is where you derive the "doctitle" field through the use of the $first and modified conditions array through $push accumulator operators when you group the documents by the createdAt field as the key.

The last pipeline step $project removes the _id field from the previous pipeline stream thus the final pipeline would look like this:

db.collection.aggregate([
    { "$unwind": "$conditions" },
    { "$sort": { "conditions.createdAt": -1 }  },
    { "$limit": 3 },
    {
        "$group": {
            "_id": "$conditions.createdAt",
            "doctitle" : { "$first": "$doctitle" },
            "conditions": { "$push": "$conditions" }
        }
    },
    {
        "$project": {
            "_id": 0, "doctitle": 1, "conditions": 1
        }
    }
])

Sample Output:

/* 0 */
{
    "result" : [ 
        {
            "doctitle" : "some title",
            "conditions" : [ 
                {
                    "createdAt" : ISODate("2014-12-21T13:59:26.000Z"),
                    "title" : "some title 1"
                }
            ]
        }, 
        {
            "doctitle" : "some title",
            "conditions" : [ 
                {
                    "createdAt" : ISODate("2014-12-22T13:59:26.000Z"),
                    "title" : "some title 2"
                }
            ]
        }, 
        {
            "doctitle" : "other title",
            "conditions" : [ 
                {
                    "createdAt" : ISODate("2014-12-23T13:59:26.000Z"),
                    "title" : "some title 4"
                }
            ]
        }
    ],
    "ok" : 1
}

Upvotes: 2

Related Questions