leoce

Reputation: 735

mongodb find matches based on count aggregation

I have a mongodb collection like this:

{"uid": "01370mask4",
 "title": "hidden",
 "post": "hidden",
 "postTime": "01-23, 2016",
 "unixPostTime": "1453538601",
 "upvote": [2, 3]}

and I'd like to select the post records of users with more than 5 posts. The structure should stay the same; I just don't need the documents from users who don't have many posts. So far I can count the posts per user:

db.collection.aggregate(
   [
     { $group : { _id : "$uid", count: { $sum: 1 } } }
   ]
)

Now I'm stuck on how to use the count values in a find. I searched but didn't find any way to add the count values back to the same collection by uid. Saving the aggregation output and joining it with the collection doesn't seem to be supported by MongoDB. Please advise, thanks!

Update:

Sorry that I didn't make it clear earlier. Thanks for your prompt answers! I want a subset of the original collection, with post text, post timestamp, etc. I don't want a subset of the aggregation output.

Upvotes: 1

Views: 12990

Answers (3)

Sarath Nair

Reputation: 2868

If there aren't millions of documents, you can take a shortcut and get what you want with one aggregate query followed by a find query.

Aggregate query:

var users = db.collection.aggregate(
  [
    {$group:{_id:'$uid', count:{$sum:1}}},
    {$match:{count:{$gt:5}}},
    {$group:{_id:null,users:{$push:'$_id'}}}
  ]
).toArray()[0]['users']

Then it's a straightforward query to find the posts of those users:

db.collection.find({uid: {$in: users}})
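
To see what the two steps compute without a running server, here is a minimal plain-JavaScript sketch over an in-memory array. The sample data and the helper names uidsWithMoreThan and postsFromUsers are made up for illustration; they are not part of MongoDB.

```javascript
// Hypothetical in-memory stand-in for the collection; values are made up.
const posts = [];
for (let i = 0; i < 6; i++) posts.push({ uid: "a", title: "t" + i });
for (let i = 0; i < 2; i++) posts.push({ uid: "b", title: "t" + i });

// Step 1: mirrors $group + $match, returning uids with more than minPosts posts.
function uidsWithMoreThan(docs, minPosts) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.uid, (counts.get(doc.uid) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > minPosts).map(([uid]) => uid);
}

// Step 2: mirrors find({uid: {$in: users}}), keeping the original documents.
function postsFromUsers(docs, uids) {
  const wanted = new Set(uids);
  return docs.filter((doc) => wanted.has(doc.uid));
}

const users = uidsWithMoreThan(posts, 5);
const selected = postsFromUsers(posts, users);
```

The second step returns the documents unchanged, which is why this approach gives a subset of the original collection rather than a subset of the aggregation output.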

Upvotes: 2

zangw

Reputation: 48396

Try this to select users with more than 5 posts. If the uid is unique, you can keep the original fields by adding each one to the $group stage with $first, as below.

db.collection.aggregate([
     {$group: {
          _id: '$uid', 
          title: {$first: '$title'}, 
          post: {$first:'$post'}, 
          postTime:{$first: '$postTime'}, 
          unixPostTime:{$first: '$unixPostTime'},
          upvote:{$first: '$upvote'}, 
          count: {$sum: 1}
     }}, 
     {$match: {count: {$gt: 5}}}
])

If there are multiple values for the same uid, use $push in the $group stage to collect them into an array instead.
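
As a rough sketch of the $push variant, the pipeline below collects each user's full documents with the $$ROOT system variable, filters on the count, and then unwinds the array to get the original documents back. It assumes MongoDB 3.4 or later for $replaceRoot; buildPipeline is a hypothetical helper name, not a MongoDB API.

```javascript
// Builds a pipeline that returns the original post documents for users
// with more than minPosts posts. buildPipeline is a hypothetical helper.
function buildPipeline(minPosts) {
  return [
    // Collect every full document per uid and count them.
    { $group: { _id: "$uid", count: { $sum: 1 }, docs: { $push: "$$ROOT" } } },
    // Keep only users above the threshold.
    { $match: { count: { $gt: minPosts } } },
    // Emit one output document per collected post.
    { $unwind: "$docs" },
    // Restore the original document shape (assumes MongoDB 3.4+).
    { $replaceRoot: { newRoot: "$docs" } }
  ];
}

const pipeline = buildPipeline(5);
// In the mongo shell: db.collection.aggregate(pipeline)
```

Each grouped document has to stay under the 16 MB BSON document limit, so this is only suitable when no single user has an enormous number of posts.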


If you want to save the result to the db, try the following:

var cur = db.collection.aggregate(
   [
     {$group: {
          _id: '$uid', 
          title: {$first: '$title'}, 
          post: {$first:'$post'}, 
          postTime:{$first: '$postTime'}, 
          unixPostTime:{$first: '$unixPostTime'},
          upvote:{$first: '$upvote'}, 
          count: {$sum: 1}
     }}, 
     {$match: {count: {$gt: 5}}}
   ]
)
cur.forEach(function(doc) {
   // doc._id holds the grouped uid, so match on the uid field
   db.collection.update({uid: doc._id}, {/* the fields to be updated */});
});

Upvotes: 1

throrin19

Reputation: 18197

Just add a $match stage after your $group with the right condition and it works:

db.collection.aggregate(
  [
    { $group : { _id : "$uid", count: { $sum: 1 } } },
    { $match : { count : { $gt : 5 } } }
  ]
)

Upvotes: 2
