Bonzo
Bonzo

Reputation: 443

Find latest data of a particular fields in mongoDB

I have below set for data in my MongoDB collections. I need to find the latest data based on field "eventType".

{
        "_id" : ObjectId("5d5690843248b8c20481f5e9"),
        "mrn" : "xp35",
        "eventType" : "LAB",
        "eventSubType" : "CBC",
        "value" : 1,
        "units" : 1,
        "charttime" : ISODate("2019-08-16T16:46:21.393Z")
}
{
        "_id" : ObjectId("5d5690843248b8c20481f5e9"),
        "mrn" : "xp35",
        "eventType" : "LAB",
        "eventSubType" : "CBB",
        "value" : 1,
        "units" : 1,
        "charttime" : ISODate("2019-08-16T16:46:22.393Z")
}
{
        "_id" : ObjectId("5d5690843248b8c20481f5ea"),
        "mrn" : "zfwy",
        "eventType" : "EDLIST",
        "eventSubType" : "Lipids",
        "value" : 1,
        "units" : 1,
        "charttime" : ISODate("2019-08-16T16:46:23.394Z")
}
{
        "_id" : ObjectId("5d5690843248b8c20481f5ea"),
        "mrn" : "zfwy",
        "eventType" : "EDLIST",
        "eventSubType" : "L",
        "value" : 1,
        "units" : 1,
        "charttime" : ISODate("2019-08-16T16:46:24.394Z")
}

I used 'aggregation' and 'find' queries and sorted it based on timestamp field "charttime" to fetch the latest data but it is not working. I need to fetch data based on field "eventType" so that for each 'eventType' I should get the latest data. So in the given example, I should get the latest data for "LAB" and "EDLIST". Ideally, it should return data:

{
        "_id" : ObjectId("5d5690843248b8c20481f5e9"),
        "mrn" : "xp35",
        "eventType" : "LAB",
        "eventSubType" : "CBB",
        "value" : 1,
        "units" : 1,
        "charttime" : ISODate("2019-08-16T16:46:22.393Z")
}
{
        "_id" : ObjectId("5d5690843248b8c20481f5ea"),
        "mrn" : "zfwy",
        "eventType" : "EDLIST",
        "eventSubType" : "L",
        "value" : 1,
        "units" : 1,
        "charttime" : ISODate("2019-08-16T16:46:24.394Z")
}

Upvotes: 3

Views: 135

Answers (2)

ambianBeing
ambianBeing

Reputation: 3529

  • First sort(descending) the data by charttime so that $first accumulator works properly.
  • Then group by eventType and find latest of the dates by $maxaccumulator.
  • $project pipe is to retain the original _id with the same key name field. If it is not required as _id you can remove the pipe altogether.

Aggregation Query:

db.collection.aggregate([
  { $sort: { charttime: -1 } },
  {
    $group: {
      _id: "$eventType",
      id: { $first: "$_id" },
      mrn: { $first: "$mrn" },
      eventType: { $first: "$eventType" },
      eventSubType: { $first: "$eventSubType" },
      value: { $first: "$value" },
      units: { $first: "$units" },
      charttime: { $max: "$charttime" }
    }
  },
  {
    $project: {
      _id: "$id",
      mrn: 1,
      eventType: 1,
      eventSubType: 1,
      value: 1,
      units: 1,
      charttime: 1
    }
  }
]);

Upvotes: 0

Hardik Shah
Hardik Shah

Reputation: 4200

Follow below steps:

  • Sort all document first.
  • Group it by eventtype.
  • Project again to get id correctly into _id (Not necessary if you are ok with id key)
  • Sort again those data (Not necessary if you are ok with different eventype not sorted by date)
db.collection.aggregate([
  { $sort: {"charttime": 1 }},
  { $group: {
      _id: "$eventType",
      id: {$first: "$_id"},
      "mrn": {$first: "$mrn"},
      "eventType": {$first: "$eventType"},
      "eventSubType": {$first: "$eventSubType"},
      "value": {$first: "$value"},
      "units": {$first: "$units"},
      "charttime": {$first: "$charttime"}
  }},
  {$project: {
      _id: "$id",
      "mrn": 1,
      "eventType": 1,
      "eventSubType": 1,
      "value": 1,
      "units": 1,
      "charttime": 1
  }},
  { $sort: {"charttime": 1 }}
])

Hope this help!

Output:

/* 1 */
{
    "_id" : ObjectId("5d5cedb1fc18699f18a24fa2"),
    "mrn" : "xp35",
    "eventType" : "LAB",
    "eventSubType" : "CBB",
    "value" : 1,
    "units" : 1,
    "charttime" : ISODate("2019-08-16T16:46:22.393Z")
}

/* 2 */
{
    "_id" : ObjectId("5d5cedc1fc18699f18a24fa9"),
    "mrn" : "zfwy",
    "eventType" : "EDLIST",
    "eventSubType" : "L",
    "value" : 1,
    "units" : 1,
    "charttime" : ISODate("2019-08-16T16:46:24.394Z")
}

===== UPDATE =====

As per your ask to optimize query:

db.collection.aggregate([
  { $sort: {"charttime": -1 }}, // Sort in descending. (So we would not have another sort after group)
  { $group: { 
      _id: "$eventType", // Group by event type
      data: {$first: "$$ROOT"} // Take whole first record
  }},
  { $replaceRoot: { newRoot: "$data" }} // Replaceroot to have document as per your requirement
])

===== UPDATE 2 ====

For too many records: - Find eventType and maximum chartTime - Iterate on each document and get records (You may have multiple calls on DB but it will take less time)

db.getCollection('Vehicle').aggregate([
  { $group: { 
      _id: "$eventType", // Group by event type
      maxChartTime: {$max: "$charttime"}
  }}
]).forEach(function(data) {
    db.getCollection('Vehicle').find({
      "eventType": data._id,
      "charttime": data.maxChartTime
    });
    // Any mechanism to have array of all retrieved documents.
    // You can handle it from your back end too.
})

Note:- I have tested it with 506983 records and got results in 0.526 sec.

Upvotes: 3

Related Questions