Reputation: 386
I'm a novice in pymongo/mongodb and now I have a challenge.
I have the following structure stored in mongodb (v 2.04).
{
"t": <timestamp>,
"d": {
"uid": <string>,
"action": <string>
}
}
This structure tracks user actions and is slightly reduced in complexity from my original. The data is pretty huge and the query will have a limiting datespan to reduce the results.
What I would like is to be able to create a table of the users who did the most actions during a certain timespan.
Table:
Rank Uid #num actions
1 5 235
2 237 234
3 574 229
So far I've only bits and pieces of the query:
query = {"t": {"$lte": end_utc, "$gte": start_utc}}
db.actions.find(query).distinct("d.uid")
Which will simply produce a list of unique uid's. How can I query (using pymongo) to get a list such as:
[
{
"actions": 100,
"uid": 273
},
{
"actions": 99",
"uid": 632
}..n sorted on actions descending
]
Upvotes: 2
Views: 680
Reputation: 312115
If you're using MongoDB 2.1+ you can use the aggregation framework for this type of query:
db.actions.aggregate([
# Filter the docs to just those within the specified timerange
{"$match": {"t": {"$lte": end_utc, "$gte": start_utc}}},
# Group the docs on d.uid, assembling a count of action docs with each value
{"$group": {"_id": "$d.uid", "actions": {"$sum": 1}}},
# Sort by actions, descending
{"$sort": { "actions": -1 }}
])
Upvotes: 5