Reputation: 3317
You know that there is no join in mongodb, So I execute likes join query like this.
users = user_collection.find({"region": "US", `and some condition here`}, projection={"user_id": 1"})
user_list = [
user['user_id']
for user in users
]
posts = post_collection.find({"user_id": {"$in": user_list}, `and some condition here`)
(To avoid bring unnecessary field, also used projection
option in find()
)
Collection and list size
users = 2000000
user_list = 100000
posts = 2000000
When I execute query, it takes almost 4 seconds.
Among them, make user_list
takes almost 3 seconds.
Question
user_id
efficiently?Thanks.
Upvotes: 1
Views: 76
Reputation: 9285
First, make sure that the fields you query on are properly indexed. If it's already done, you can try this:
you could use distinct
to get the user_list
in one single query:
something like this:
user_list = user_collection.distinct("user_id", {"region": "US", ...})
second option is to retrieve the posts in a single query by performing a $lookup from the user_collection
:
user_collection.aggregate([
{
"$match": {"region": "US", ...}
},
{
"$lookup": {
"from": "post_collection",
"localField": "user_id",
"foreignField": "user_id",
"as": "post"
}
},
...
])
and then filter the posts with a $unwind and a $match stage
Upvotes: 1