Reputation: 647
I have been working on a MongoDB query that keeps getting more and more complicated. I've opened a couple of threads already, but I feel like it would be better to explain my entire case and adjust the query all at once to fit all my needs, because I feel like I need to change everything as soon as I want to introduce more and more conditions.
So, for my case, I want to create a chat application. Users can have one-on-one conversations, as well as group chats with up to 25 people. The idea behind my database is to keep a conversations
document and a messages
document which are linked to each other using the _id from the conversation. This is my conversations
document:
{
"_id" : ObjectId("5e35f2c840713a43aeeeb3d9"),
"n" : "Example Group Chat",
"members" : [
{
"uID" : "1",
"j" : 1580580922,
"i" : "1",
"r" : "admin",
"a" : 1
},
{
"uID" : "4",
"j" : 1580580922,
"l" : 1581863346,
"i" : "1",
"r" : "member",
"a" : 0
},
{
"uID" : "5",
"j" : 1580581922,
"i" : "1",
"r" : "member",
"a" : 1
},
{
"uID" : "9",
"j" : 1580593922,
"i" : "1",
"r" : "member",
"a" : 1
},
{
"uID" : "3",
"j" : 1580594920,
"i" : "1",
"r" : "member",
"a" : 1
},
{
"uID" : "8",
"j" : 1580594999,
"i" : "1",
"r" : "member",
"a" : 1
}
]
}
{
"_id" : ObjectId("5e39d5d740713a43aeef5b26"),
"members" : [
{
"uID" : "1",
"j" : 1580580922,
"i" : "1",
"r" : "member",
"a" : 1
},
{
"uID" : "2",
"j" : 1580580922,
"i" : "1",
"r" : "member",
"a" : 1
}
]
}
You can see a group chat (first) and a one-on-one chat (second). Group chats can have names (n), one-on-one chats not. Every conversation has an array of members, which stores the userID (uID), the join date (j), the left date (l), an invited by userID field (i), a role field (r) and an active field (a). I probably won't need the "active" field, as I have a join/left timestamp, but still. I'll probably delete it later, so that field probably won't be included.
Next, I have my messages
document as follows:
{
"_id" : ObjectId("5e4917bca59ce44ef2770086"),
"c_ID" : ObjectId("5e35f2c840713a43aeeeb3d9"),
"msg" : "Whats good?",
"fromID" : "1",
"__v" : 0,
"t" : 1582369525,
"d" : {
"4" : 1582369525
},
"r" : {
"4" : 1582369525
}
}
This holds the message itself (msg), the user who sent it (fromID), the timestamp in UNIX Epoch (t) and subcollections of deliveries (d) and reads (r) and of course the conversation ObjectID (c_ID).
I currently have the following query, after much help here on StackOverflow:
db.conversations.aggregate([
{
$lookup: {
from: "messages",
foreignField: "c_ID",
localField: "_id",
as: "messages"
}
},
{
"$unwind": "$messages"
},
{
"$sort": {
"messages.t": -1
}
},
{
"$group": {
"_id": "$_id",
"lastMessage": {
"$first": "$messages"
},
"allFields": {
"$first": "$$ROOT"
}
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
"$allFields",
{
"lastMessage": "$lastMessage"
}
]
}
}
},
{
$project: {
messages: 0
}
},
{
$match: {
"members.uID": "1"
}
},
{
$sort: {
"lastMessage.t": 1
}
},
{
$limit: 10
},
{
$project: {
members: {
$filter: {
input: "$members",
as: "member",
cond: {
$and: [
{
$ne: [
"$$member.uID",
"1"
]
},
{
$or: [
{
$eq: [
"$$member.l",
undefined
]
},
{
$lt: [
"$$member.l",
"$$member.j"
]
}
]
}
]
}
}
},
n: 1,
lastMessage: 1
}
}
])
I am facing a couple of problems with the above:
l
field is set on the member with a timestamp greater than the join date, the user can still see that conversation, but the last message has to be before the left timestamp. For example: member with userID 4 has left the group chat, but the chat is still visible in his overview, with messages BEFORE his left date (1581863346
)1581863346
)memberCount
field which is just the number of users, based on the same criteria: no left date (undefined) or join date larger than left date, so the retrieve the “actual” number of members currently in the group.Any ideas on how to improve my query to include the two above conditions? Or, also, is there a way to optimize my query for performance? Cheers!
By the way: this might be important, but I guess this makes sense if you've read all the way through: this query is ment to be used in the overview of all your active conversations, sorted by timestamp. What you'll see first if you open Facebook Messenger or WhatsApp, let's say.
EDIT: Based on the answer of @Tunmee I found two more conditions that needed to be applied:
- The lastMessage should be posted not only before the left date (l
), but also after the join date (j
).
- I want to return only the conversations in which a lastMessage is retrieved, so only when there is a chat message to be shown to the user.
I came up with the following query. Does this look good? Any concerns?
db.conversations.aggregate([
{
$match: {
"members.uID": "4"
}
},
{
$addFields: {
user: {
$arrayElemAt: [
{
$filter: {
input: "$members",
as: "member",
cond: {
$eq: [
"$$member.uID",
"4"
]
}
}
},
0
]
}
}
},
{
$lookup: {
from: "messages",
let: {
user: "$user",
conversatoinId: "$_id"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$eq: [
"$c_ID",
"$$conversatoinId"
]
},
{
$cond: {
if: {
$lt: [
{
$ifNull: [
"$$user.l",
0
]
},
"$$user.j"
]
},
then: true,
else: {
$lt: [
"$t",
"$$user.l"
]
}
}
},
{
$gt: [
"$t",
"$$user.j"
]
}
]
}
},
},
{
$sort: {
"t": -1
},
}
],
as: "messages"
}
},
{
$project: {
lastMessage: {
$arrayElemAt: [
"$messages",
0
]
},
n: 1,
members: 1
}
},
{
$sort: {
"lastMessage.t": 1
}
},
{
$project: {
members: {
$filter: {
input: "$members",
as: "member",
cond: {
$and: [
{
$ne: [
"$$member.uID",
"4"
]
},
{
$or: [
{
$eq: [
"$$member.l",
undefined
]
},
{
$lt: [
"$$member.l",
"$$member.j"
]
}
]
}
]
}
}
},
memberCount: {
$size: {
$filter: {
input: "$members",
as: "member",
cond: {
$and: [
{
$ne: [
"$$member.uID",
"4"
]
},
{
$or: [
{
$eq: [
"$$member.l",
undefined
]
},
{
$lt: [
"$$member.l",
"$$member.j"
]
}
]
}
]
}
}
}
},
n: 1,
lastMessage: 1
}
},
{
$match: {
lastMessage: {
$exists: true
}
}
},
{
$limit: 10
}
])
EDIT: As far as I can tell, it all looks good at the moment. The only thing I'm missing now is something to catch the following situation:
Imagine there is a group chat with 20 members. The name of the chat is "Funky Fridays". userID 4 joins that group chat (I set the j
(join) field to 1582475543
(timestamp), participates for twee weeks and then leaves (I set the l
(left) field to 1583685143
(timestamp). This will all work fine. However, how can I add userID 4 again to the same group chat 1 week later (timestamp 1584289943
) and make sure userID 4 can see the lastMessage IF it is either between the first time he joined/left OR if it's posted after he joined again?
I would like to be able to add the same user more than once to the members array, but with different j
(and l
) fields, and then query the lastMessage to be between one of them, which would allow for what I want to do as described above.
Upvotes: 1
Views: 1162
Reputation: 2573
Try this:
db.conversations.aggregate([
{
$match: {
"members.uID": "4"
}
},
{
$addFields: {
user: {
$arrayElemAt: [
{
$filter: {
input: "$members",
as: "member",
cond: {
$eq: [
"$$member.uID",
"4"
]
}
}
},
0
]
}
}
},
{
$lookup: {
from: "messages",
let: {
user: "$user",
conversatoinId: "$_id"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$eq: [
"$c_ID",
"$$conversatoinId"
]
},
{
$cond: {
if: {
$lt: [
{
$ifNull: [
"$$user.l",
0
]
},
"$$user.j"
]
},
then: true,
else: {
$lt: [
"$t",
"$$user.l"
]
}
}
}
]
}
},
},
{
$sort: {
"t": -1
},
}
],
as: "messages"
}
},
{
$project: {
lastMessage: {
$arrayElemAt: [
"$messages",
0
]
},
n: 1,
members: 1
}
},
{
$match: {
lastMessage: {
$exists: true
}
}
},
{
$sort: {
"lastMessage.t": 1
}
},
{
$limit: 10
},
{
$project: {
members: {
$filter: {
input: "$members",
as: "member",
cond: {
$and: [
{
$ne: [
"$$member.uID",
"1"
]
},
{
$or: [
{
$eq: [
"$$member.l",
undefined
]
},
{
$lt: [
"$$member.l",
"$$member.j"
]
}
]
}
]
}
}
},
n: 1,
lastMessage: 1
}
}
])
Pipeline Explanation:
match
: This matches only the conversations that the user is a part of;addFields
: This adds a "user" field to each conversation document. The user field stores the member document(from the members array) of the particular user we are aggregating data for. lookup
: This is just like the lookup
stage in the pipeline you posted, however, this one uses the lookup#pipeline
and lookup#let
options to add some constraints to the messages that are loaded.lookup#let
: I'm basically declaring variables to use in the lookup#pipeline
. lookup#pipeline#match
: This sub-pipeline stage ensures the message document: (1) has a conversation ID(c_ID
) that matches the id of the conversation document. (2) if the user has left the channel i.e $$user.l
is less than $$user.j
, the message timestamp must be lower than the date the user left the channel. Notice the use of the ifNull
operator to set the $$user.l
field as 0 if it was null i.e the user has not left the group at all.lookup#pipeline#sort
: This sorts the messages that are passed in from the match stage in descending order. This is added so that we can easily get the last message;project
: Reshapes the document to include the lastMessage
field and remove the messages
field.sort
: This is the same as what was in the pipeline you posted.limit
: This is the same as what was in the pipeline you posted.project
: This is the same as what was in the pipeline you posted.Upvotes: 2