Reputation:
db.chat.find().pretty().limit(3)
{
"_id" : ObjectId("593921425ccc8150f35e7662"),
"user1" : 1,
"user2" : 2,
"messages" : [
{
"sender" : 1,
"datetime" : ISODate("2017-06-08T10:04:50Z"),
"body" : "hiii 0"
},
{
"sender" : 2,
"datetime" : ISODate("2017-06-09T10:04:50Z"),
"body" : "hiii 1"
},
{
"sender" : 2,
"datetime" : ISODate("2017-06-10T10:04:50Z"),
"body" : "hiii 2"
}
]
}
{
"_id" : ObjectId("593921425ccc8150f35e7663"),
"user1" : 1,
"user2" : 3,
"messages" : [
{
"sender" : 1,
"datetime" : ISODate("2017-06-08T10:04:50Z"),
"body" : "hiii 0"
},
{
"sender" : 3,
"datetime" : ISODate("2017-06-09T10:04:50Z"),
"body" : "hiii 1"
},
{
"sender" : 1,
"datetime" : ISODate("2017-06-10T10:04:50Z"),
"body" : "hiii 2"
}
]
}
{
"_id" : ObjectId("593921425ccc8150f35e7664"),
"user1" : 1,
"user2" : 4,
"messages" : [
{
"sender" : 1,
"datetime" : ISODate("2017-06-08T10:04:50Z"),
"body" : "hiii 0"
},
{
"sender" : 1,
"datetime" : ISODate("2017-06-09T10:04:50Z"),
"body" : "hiii 1"
},
{
"sender" : 4,
"datetime" : ISODate("2017-06-10T10:04:50Z"),
"body" : "hiii 2"
}
]
}
Above mongodb collection store chats between user1 and user2. I am looking for a query that will give me result of the latest messages where message.sender = 1 of each row.
i.e looking for 3 rows as output
For user1=1 and user2, message hiii0 should only come
For user1=1 and user3, message hiii2 should only come
For user1=1 and user4, message hiii1 should only come
ie. just three rows.
db.chat.find({"messages.sender":1})
is giving all the rows while i am looking for just the matched row with the latest datetime.
Please help
Example:
db.chat.aggregate([
{$unwind:"$messages"},
{$match:{"messages.sender":1}},
{$sort:{"messages.datetime":-1}
])
is giving output as
{ "_id" : ObjectId("593921425ccc8150f35e7663"), "user1" : 1, "user2" : 3, "messages" : { "sender" : 1, "datetime" : ISODate("2017-06-10T10:04:50Z"), "body" : "hiii 2" } }
{ "_id" : ObjectId("593921425ccc8150f35e7664"), "user1" : 1, "user2" : 4, "messages" : { "sender" : 1, "datetime" : ISODate("2017-06-09T10:04:50Z"), "body" : "hiii 1" } }
{ "_id" : ObjectId("593921425ccc8150f35e7662"), "user1" : 1, "user2" : 2, "messages" : { "sender" : 1, "datetime" : ISODate("2017-06-08T10:04:50Z"), "body" : "hiii 0" } }
{ "_id" : ObjectId("593921425ccc8150f35e7663"), "user1" : 1, "user2" : 3, "messages" : { "sender" : 1, "datetime" : ISODate("2017-06-08T10:04:50Z"), "body" : "hiii 0" } }
{ "_id" : ObjectId("593921425ccc8150f35e7664"), "user1" : 1, "user2" : 4, "messages" : { "sender" : 1, "datetime" : ISODate("2017-06-08T10:04:50Z"), "body" : "hiii 0" } }
Last two rows are not desirable as its its not the latest record for user1-user2 record.
If i am adding ,{$limit:1} , its giving just one row.
Upvotes: 0
Views: 1827
Reputation: 151072
The basic concept here is you need the aggregation framework in order to apply conditions to "filter" the array elements to the conditions. Depending on the available version there are different techniques that can be applied.
In all cases this is the result:
{
"_id" : ObjectId("593921425ccc8150f35e7664"),
"user1" : 1,
"user2" : 4,
"messages" : {
"sender" : 1,
"datetime" : ISODate("2017-06-09T10:04:50Z"),
"body" : "hiii 1"
}
}
{
"_id" : ObjectId("593921425ccc8150f35e7663"),
"user1" : 1,
"user2" : 3,
"messages" : {
"sender" : 1,
"datetime" : ISODate("2017-06-10T10:04:50Z"),
"body" : "hiii 2"
}
}
{
"_id" : ObjectId("593921425ccc8150f35e7662"),
"user1" : 1,
"user2" : 2,
"messages" : {
"sender" : 1,
"datetime" : ISODate("2017-06-08T10:04:50Z"),
"body" : "hiii 0"
}
}
db.chat.aggregate([
{ "$match": { "messages.sender": 1 } },
{ "$replaceRoot": {
"newRoot": {
"$let": {
"vars": {
"messages": {
"$filter": {
"input": "$messages",
"as": "m",
"cond": { "$eq": [ "$$m.sender", 1 ] }
}
},
"maxDate": {
"$max": {
"$map": {
"input": {
"$filter": {
"input": "$messages",
"as": "m",
"cond": { "$eq": [ "$$m.sender", 1 ] }
}
},
"as": "m",
"in": "$$m.datetime"
}
}
}
},
"in": {
"_id": "$_id",
"user1": "$user1",
"user2": "$user2",
"messages": {
"$arrayElemAt": [
{ "$filter": {
"input": "$$messages",
"as": "m",
"cond": { "$eq": [ "$$m.datetime", "$$maxDate" ] }
}},
0
]
}
}
}
}
}}
])
This is the most efficient way which takes advantage of $replaceRoot
which allows us to declare variables to use in the "replaced" structure using $let
. The main advantage here is that this requires only "two" pipeline stages.
In order to match the array content you use $filter
where you apply the $eq
logical operation to test the value of "sender"
. Where the condition matches, then only the matching array entries are returned.
Using the same $filter
so that only the matching "sender" entries are considered, we then want to apply $max
over the "filtered" list to the values in "datetime"
. The $max
]5 value is the "latest" date by the conditions.
We want this value so we can later compare the returned results from the "filtered" array to this "maxDate". Which is what happens inside the "in"
block of $let
where the two "variables" declared earlier for the filtered content and the "maxDate" are again applied to $filter
in order to return what should be the only value that met both conditions having the "latest date" as well.
Since you only want "one" result, we use $arrayElemAt
to use the value rather than the array.
db.chat.aggregate([
{ "$match": { "messages.sender": 1 } },
{ "$project": {
"user1": 1,
"user2": 1,
"messages": {
"$filter": {
"input": "$messages",
"as": "m",
"cond": { "$eq": [ "$$m.sender", 1 ] }
}
},
"maxDate": {
"$max": {
"$map": {
"input": {
"$filter": {
"input": "$messages",
"as": "m",
"cond": { "$eq": [ "$$m.sender", 1 ] }
}
},
"as": "m",
"in": "$$m.datetime"
}
}
}
}},
{ "$project": {
"user1": 1,
"user2": 1,
"messages": {
"$arrayElemAt":[
{ "$filter": {
"input": "$messages",
"as": "m",
"cond": { "$eq": [ "$$m.datetime", "$maxDate" ] }
}},
0
]
}
}}
])
This is basically the same process as described, but without the $replaceRoot
pipeline stage, we need to apply in two $project
stages. The reason for this is we need the "calculated value" from "maxDate" in order to do that final $filter
, and it is not available to do in a compound statement, so instead we split the pipelines. This has a small impact on the overall cost of the operation.
In MongoDB 2.6 to 3.0 we can use most of the technique here except for $arrayElemAt
and either accept the "array" result with a single entry or put in an $unwind
stage to deal with what should now be a single entry.
db.chat.aggregate([
{ "$match": { "messages.sender": 1 } },
{ "$unwind": "$messages" },
{ "$match": { "messages.sender": 1 } },
{ "$sort": { "_id": 1, "messages.datetime": -1 } },
{ "$group": {
"_id": "$_id",
"user1": { "$first": "$user1" },
"user2": { "$first": "$user2" },
"messages": { "$first": "$messages" }
}}
])
Whilst it looks brief, this is by far the most costly operation. Here you must use $unwind
in order to apply the conditions to the array elements. This is a very costly process as it produces a copy of each document for each array entry, and is essentially replaced by the modern operators that avoid this in the case of "filtering".
The second $match
stage here discards any elements ( now "documents" ) which did not match the "sender" condition. Then we apply a $sort
in order to put the "latest" date on top for each document by the _id
, hence the two "sort" keys.
Finally we apply $group
in order to just refer to the original document, using $first
as the accumulator to get the element that is "on top".
Upvotes: 1