Scott

Reputation: 6736

Aggregating a collection in mongo to determine counts

I have a MongoDB collection that my application server writes to every time a user performs an action, a sort of log collection. Each document follows the pattern below, with a few irrelevant fields omitted for brevity:

{
        "_id" : ObjectId("5536727686e09bab576f1c94"),
        "_user" : ObjectId("5536597b60b2321aadeb8a7c"),
        "_userAgent" : "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0",
        "_systemTime" : ISODate("2015-04-21T15:53:26.272Z")
        ...
}

I ran some stress tests this week on Tuesday, March 21st, and Wednesday, March 22nd. What would be a good way to query how many distinct users have at least one entry in this collection on Tuesday, March 21st, and how many have at least one entry on Wednesday, March 22nd? Keeping the dates in Zulu time (UTC) is sufficient.

Upvotes: 1

Views: 71

Answers (3)

nickmilon

Reputation: 1372

What about this?

db.your_collection.aggregate([
    {'$project': {'_id': {
        'yymmdd': {'$dateToString': {'date': '$_systemTime', 'format': '%Y-%m-%d'}},
        'user': '$_user'
    }}},
    {'$group': {'count': {'$sum': 1}, '_id': '$_id'}}
])

(*) You may want to add a $match range clause at the beginning of the pipeline if you only need to restrict it to certain dates.
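The pipeline above yields one document per (day, user) pair, so a second $group over those pairs should give the number of distinct users per day. Below is a minimal sketch of that idea, not a tested production query. The helper name `distinctUsersPerDayPipeline`, the output field `distinctUsers`, and the year 2015 in the date range are assumptions made for illustration; the field names `_systemTime` and `_user` come from the question.

```javascript
// Sketch only: builds the pipeline as plain data so it can be inspected first.
// distinctUsersPerDayPipeline and distinctUsers are hypothetical names.
function distinctUsersPerDayPipeline(start, end) {
  return [
    // Restrict to the half-open range [start, end).
    { $match: { _systemTime: { $gte: start, $lt: end } } },
    // Collapse all entries into one document per (day, user) pair.
    // $dateToString formats the date in UTC by default.
    { $group: {
        _id: {
          day: { $dateToString: { format: "%Y-%m-%d", date: "$_systemTime" } },
          user: "$_user"
        }
    } },
    // Count the pairs per day, which is the number of distinct users that day.
    { $group: { _id: "$_id.day", distinctUsers: { $sum: 1 } } },
    { $sort: { _id: 1 } }
  ];
}

// Assumed date range: the two test days, taken to be in 2015.
const pipeline = distinctUsersPerDayPipeline(
  new Date("2015-03-21T00:00:00Z"),
  new Date("2015-03-23T00:00:00Z")
);
// In the mongo shell: db.your_collection.aggregate(pipeline)
```

Each result document would then carry a day string as `_id` and the count in `distinctUsers`, so both days come back from a single query.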

Upvotes: 0

Markus W Mahlberg

Reputation: 20712

So, if I understand you correctly, you want the distinct users who visited on either of those days?

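db.collection.aggregate([
  // Keep only the entries from the two test days
  { $match:
    { "_systemTime":
      {
        $gte: ISODate("2015-03-21T00:00:00"),
        $lt: ISODate("2015-03-23T00:00:00")
      }
    }
  },
  // One result document per (user, day of month) pair
  { $group:
    {
      _id: { "_user":"$_user", day: { $dayOfMonth: "$_systemTime" } },
      visits: { $sum: 1 }
    }
  },
  // Write the results to the usersPerDay collection
  { $out: "usersPerDay"}
])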

However, this aggregation only really makes sense when you compute something extra in the $group stage, for example a visit count, as in the example.

Now you can query usersPerDay to get the information you want:

db.usersPerDay.find({"_id.day":21}).count()

If you adapt the $group stage shown above, the aggregation should be capable of giving you basically any time-based statistic.

Upvotes: 0

Salvador Dali

Reputation: 222889

I think you do not need aggregate here. You can achieve what you want with a distinct query together with length.

db.coll.distinct('_user', {
    _systemTime: {
        $gte: ISODate(some time),
        $lt: ISODate(some other time)
    }
}).length

Check out more use cases of distinct; not a lot of people know about this method.
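To make the placeholders concrete, here is a rough sketch of how the two filters could be built, one per day. The helper `utcDayRange` is hypothetical, and the year 2015 is an assumption; only `_user`, `_systemTime`, and the two calendar days come from the question.

```javascript
// Hypothetical helper: half-open UTC range [00:00 of the day, 00:00 of the next day).
function utcDayRange(isoDay) {
  const start = new Date(isoDay + "T00:00:00Z");
  // UTC has no daylight saving shifts, so adding 24 hours lands on the next midnight.
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return { $gte: start, $lt: end };
}

// Assumed dates: the two test days, taken to be in 2015.
const tuesday = { _systemTime: utcDayRange("2015-03-21") };
const wednesday = { _systemTime: utcDayRange("2015-03-22") };

// In the mongo shell:
//   db.coll.distinct('_user', tuesday).length
//   db.coll.distinct('_user', wednesday).length
```

Since distinct returns every matching value in a single result, this is probably best suited to user sets of moderate size.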

Upvotes: 2
