Reputation: 303
Hey guys I have a database that is updated every few hours but I'm having a hard time trying to query the most recent data from each day from a date range.
My database structure looks like this
{
"_id" : ObjectId("5b96787ebe9d44528eaa18a0"),
"values" : [{}],
"user": "user10",
"date" : "2018-09-09 00:00:00",
"type" : "patient",
"extracted_date" : "2018-09-10 10:58:18"
}
I was able to get close to the behaivor that I want by grouping date and getting the last extracted date, like this:
{ "_id" : "2018-09-15 00:00:00", "extracted_date" : "2018-09-19 13:50:22" }
{ "_id" : "2018-09-16 00:00:00", "extracted_date" : "2018-09-19 13:47:26" }
{ "_id" : "2018-09-17 00:00:00", "extracted_date" : "2018-09-19 13:45:00" }
{ "_id" : "2018-09-11 00:00:00", "extracted_date" : "2018-09-12 10:09:17" }
{ "_id" : "2018-09-12 00:00:00", "extracted_date" : "2018-09-14 15:34:59" }
{ "_id" : "2018-09-14 00:00:00", "extracted_date" : "2018-09-19 13:54:34" }
{ "_id" : "2018-09-13 00:00:00", "extracted_date" : "2018-09-14 15:36:10" }
{ "_id" : "2018-09-18 00:00:00", "extracted_date" : "2018-09-19 13:42:23" }
But when I group the data I end up getting all the values from that day and I only need the last one. Here's an example of the query I used:
db.collection.aggregate({'$match': {'type': 'user', 'date': {'$gte': '2018-09-11 00:00:00', '$lte': '2018-09-18 00:00:00'}}}, {'$group': {'_id': {'type': '$type', 'user': '$user', 'date': '$date'}, 'extracted_date': {'$last': '$extracted_date'}, 'values': {'$push': '$values'}}})
If possile I would like to retrive the information as close to the structure utilized by the database.
Thank you very much for your help!
Edit: This is a case example that I need.
database objects:
{
"_id" : ObjectId("5b96787ebe9d44528eaa18a0"),
"values" : [{'field1': 1, 'field2': 3}],
"user": "user10",
"date" : "2018-09-09 00:00:00",
"type" : "patient",
"extracted_date" : "2018-09-10 10:58:18"
}
{
"_id" : ObjectId("5b96787ebe9d44528eaa18a0"),
"values" : [{'field1': 1, 'field2': 4}],
"user": "user10",
"date" : "2018-09-09 00:00:00",
"type" : "patient",
"extracted_date" : "2018-09-11 10:58:18"
}
{
"_id" : ObjectId("5b96787ebe9d44528eaa18a0"),
"values" : [{'field11': 2, 'field2': 10}],
"user": "user11",
"date" : "2018-09-05 00:00:00",
"type" : "patient",
"extracted_date" : "2018-09-10 10:58:18"
}
Expected return:
{
"_id" : ObjectId("5b96787ebe9d44528eaa18a0"),
"values" : [{'field1': 1, 'field2': 4}],
"user": "user10",
"date" : "2018-09-09 00:00:00",
"type" : "patient",
"extracted_date" : "2018-09-11 10:58:18"
}
{
"_id" : ObjectId("5b96787ebe9d44528eaa18a0"),
"values" : [{'field11': 2, 'field2': 10}],
"user": "user11",
"date" : "2018-09-05 00:00:00",
"type" : "patient",
"extracted_date" : "2018-09-10 10:58:18"
}
Since there's 2 objected from the same date it only returns the one with the most recent extracted_date
Upvotes: 1
Views: 87
Reputation: 18525
If I understand you correctly you want to just get the last
values
array and not all of them for that day combined ...
So just do $last
for the values like you did for extracted_date
.
UPDATE:
Since you are looking for the most recent data for that range you need to use the sort
as per matthPen
suggestion and just get the needed fields out of the _id
and then hide the _id
:
db.collection.aggregate([{
"$match": {
"type": "patient",
"date": {
"$gte": "2018-09-05 00:00:00",
"$lte": "2018-09-18 00:00:00"
}
}
},
{
"$sort": {
"extracted_date": 1
}
},
{
"$group": {
"_id": {
"type": "$type",
"user": "$user",
"date": "$date"
},
"id": {
$last: "$_id"
},
"date": {
$last: "$date"
},
"type": {
$last: "$type"
},
"extracted_date": {
$last: "$extracted_date"
},
"values": {
$last: "$values"
}
}
},
{
"$project": {
"_id": 0
}
}
])
You can see it here
Upvotes: 1
Reputation: 4363
Think you are confusing between 'the last per day' (the one which appened the later in each day) and $last (the last coming from previous stage in pipeline)!
You need to add a sort stage before grouping, to ensure that $last is 'the last'.
db.collection.aggregate({
"$match": {
"type": "user",
"date": {
"$gte": "2018-09-11 00:00:00",
"$lte": "2018-09-18 00:00:00"
}
}
},
{$sort:{
date:1,
extracted_date:1
}
},
{
"$group": {
"_id": {
"type": "$type",
"user": "$user",
"date": "$date"
},
"extracted_date": {
"$last": "$extracted_date"
},
"values": {
"$last": "$values"
}
}
})
Upvotes: 0