Reputation: 1165
I've got around 100 documents that look like this:
{
  "_id": "p1",
  "when": "2016-06-01T00:00:00.000Z",
  "users": [
    "u1",
    "u2"
  ]
},
{
  "_id": "p2",
  "when": "2016-06-01T00:00:00.000Z",
  "users": [
    "u1",
    "u3"
  ]
},
{
  "_id": "p3",
  "when": "2016-06-01T00:00:00.000Z",
  "users": [
    "u1",
    "u2",
    "u3"
  ]
}
I want to run a query that will return all the user ids that exist in exactly 3 documents between two dates.
So for example, using the above data and for the dates 2016-04-01 to 2016-07-01 it would return u1 (u1 exists in 3 records between those dates).
I presume it'll use some sort of group by or count with the aggregation framework, but I'm looking for the most performant / efficient way of doing it, so I'm open to suggestions. Bear in mind that although there may only be 100 documents, there could be 1 million users (so the users array in each document could be huge).
Any thoughts?
Upvotes: 0
Views: 33
Reputation: 16805
You can try this query:
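db.getCollection('collectionName').aggregate([
  // Keep documents in the date range (assumes `when` is stored as an ISO-8601 string, as in the question)
  { $match: { when: { $gte: '2016-04-01T00:00:00.000Z', $lte: '2016-07-01T00:00:00.000Z' } } },
  // Output one document per entry in the users array
  { $unwind: '$users' },
  // Count how many documents each user id appears in
  { $group: { _id: '$users', count: { $sum: 1 } } },
  // Keep only the users that appear exactly 3 times
  { $match: { count: 3 } }
])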
The query works in the following steps:
- Filter the documents to those between the two dates
- Unwind the users array so there is one record per user id
- Count how many times each user id appears
- Keep only the user ids that appear exactly 3 times
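If it helps to see what those steps compute, here is a rough plain-JavaScript sketch of the same logic run against the three sample documents from the question. It does not talk to MongoDB at all. `usersInExactlyNDocs` is a made-up helper name for illustration, and the sketch assumes `when` is an ISO-8601 string and that a user id appears at most once per document.

```javascript
// Plain-JavaScript mirror of the pipeline steps, for illustration only.
// `usersInExactlyNDocs` is a hypothetical helper, not part of any MongoDB driver.
function usersInExactlyNDocs(docs, from, to, n) {
  const counts = new Map();
  for (const doc of docs) {
    // Date filter: ISO-8601 strings in the same format compare correctly as strings
    if (doc.when < from || doc.when > to) continue;
    // Unwind + group: one increment per (document, user id) pair
    for (const user of doc.users) {
      counts.set(user, (counts.get(user) || 0) + 1);
    }
  }
  // Final filter: keep the user ids counted exactly n times
  return [...counts].filter(([, count]) => count === n).map(([user]) => user);
}

// Sample data copied from the question
const docs = [
  { _id: 'p1', when: '2016-06-01T00:00:00.000Z', users: ['u1', 'u2'] },
  { _id: 'p2', when: '2016-06-01T00:00:00.000Z', users: ['u1', 'u3'] },
  { _id: 'p3', when: '2016-06-01T00:00:00.000Z', users: ['u1', 'u2', 'u3'] },
];

const result = usersInExactlyNDocs(
  docs,
  '2016-04-01T00:00:00.000Z',
  '2016-07-01T00:00:00.000Z',
  3
);
console.log(result); // [ 'u1' ]
```

With the sample data, u1 is counted 3 times and u2 and u3 twice each, so only u1 comes back. In the real aggregation the counting happens on the server, so the user ids do not need to be pulled into the client first.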
Upvotes: 1