Reputation: 443
I have below set for data in my MongoDB collections. I need to find the latest data based on field "eventType".
{
"_id" : ObjectId("5d5690843248b8c20481f5e9"),
"mrn" : "xp35",
"eventType" : "LAB",
"eventSubType" : "CBC",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:21.393Z")
}
{
"_id" : ObjectId("5d5690843248b8c20481f5e9"),
"mrn" : "xp35",
"eventType" : "LAB",
"eventSubType" : "CBB",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:22.393Z")
}
{
"_id" : ObjectId("5d5690843248b8c20481f5ea"),
"mrn" : "zfwy",
"eventType" : "EDLIST",
"eventSubType" : "Lipids",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:23.394Z")
}
{
"_id" : ObjectId("5d5690843248b8c20481f5ea"),
"mrn" : "zfwy",
"eventType" : "EDLIST",
"eventSubType" : "L",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:24.394Z")
}
I used 'aggregation' and 'find' queries and sorted it based on timestamp field "charttime" to fetch the latest data but it is not working. I need to fetch data based on field "eventType" so that for each 'eventType' I should get the latest data. So in the given example, I should get the latest data for "LAB" and "EDLIST". Ideally, it should return data:
{
"_id" : ObjectId("5d5690843248b8c20481f5e9"),
"mrn" : "xp35",
"eventType" : "LAB",
"eventSubType" : "CBB",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:22.393Z")
}
{
"_id" : ObjectId("5d5690843248b8c20481f5ea"),
"mrn" : "zfwy",
"eventType" : "EDLIST",
"eventSubType" : "L",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:24.394Z")
}
Upvotes: 3
Views: 135
Reputation: 3529
charttime
so that $first
accumulator works properly.eventType
and find latest of the dates by
$max
accumulator.$project
pipe is to retain the original _id
with the same key
name field. If it is not required as _id
you can remove the pipe
altogether.Aggregation Query:
db.collection.aggregate([
{ $sort: { charttime: -1 } },
{
$group: {
_id: "$eventType",
id: { $first: "$_id" },
mrn: { $first: "$mrn" },
eventType: { $first: "$eventType" },
eventSubType: { $first: "$eventSubType" },
value: { $first: "$value" },
units: { $first: "$units" },
charttime: { $max: "$charttime" }
}
},
{
$project: {
_id: "$id",
mrn: 1,
eventType: 1,
eventSubType: 1,
value: 1,
units: 1,
charttime: 1
}
}
]);
Upvotes: 0
Reputation: 4200
Follow below steps:
eventtype
.id
correctly into _id
(Not necessary if you are ok with id
key)eventype
not sorted by date)db.collection.aggregate([
{ $sort: {"charttime": 1 }},
{ $group: {
_id: "$eventType",
id: {$first: "$_id"},
"mrn": {$first: "$mrn"},
"eventType": {$first: "$eventType"},
"eventSubType": {$first: "$eventSubType"},
"value": {$first: "$value"},
"units": {$first: "$units"},
"charttime": {$first: "$charttime"}
}},
{$project: {
_id: "$id",
"mrn": 1,
"eventType": 1,
"eventSubType": 1,
"value": 1,
"units": 1,
"charttime": 1
}},
{ $sort: {"charttime": 1 }}
])
Hope this help!
Output:
/* 1 */
{
"_id" : ObjectId("5d5cedb1fc18699f18a24fa2"),
"mrn" : "xp35",
"eventType" : "LAB",
"eventSubType" : "CBB",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:22.393Z")
}
/* 2 */
{
"_id" : ObjectId("5d5cedc1fc18699f18a24fa9"),
"mrn" : "zfwy",
"eventType" : "EDLIST",
"eventSubType" : "L",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:24.394Z")
}
As per your ask to optimize query:
db.collection.aggregate([
{ $sort: {"charttime": -1 }}, // Sort in descending. (So we would not have another sort after group)
{ $group: {
_id: "$eventType", // Group by event type
data: {$first: "$$ROOT"} // Take whole first record
}},
{ $replaceRoot: { newRoot: "$data" }} // Replaceroot to have document as per your requirement
])
For too many records: - Find eventType and maximum chartTime - Iterate on each document and get records (You may have multiple calls on DB but it will take less time)
db.getCollection('Vehicle').aggregate([
{ $group: {
_id: "$eventType", // Group by event type
maxChartTime: {$max: "$charttime"}
}}
]).forEach(function(data) {
db.getCollection('Vehicle').find({
"eventType": data._id,
"charttime": data.maxChartTime
});
// Any mechanism to have array of all retrieved documents.
// You can handle it from your back end too.
})
Note:- I have tested it with
506983
records and got results in0.526
sec.
Upvotes: 3