Reputation: 1428
I have a collection
{
"name" : "foo"
"clicked" : {"0":6723,"1": 1415,"2":1122}
}
{
"name" : "bar"
"clicked" : {"8":1423,"9": 1415,"10":1122}
}
{
"name" : "xyz"
"clicked" : {"22":6723,"23": 1415,"2":1234}
}
clicked is basically {"position of item-clicked in the list" : "id of the item"}
The final output which I want is the total number of times an item has been clicked i.e the following for the above sample:
{
6723:2,
1415:3,
1423:1,
1122:2,
1234:1
}
One way to do this by maintaining an in memory dict (in python script) and looking up "clicked"
field in each document to update the dict.
I'm new to mongo please help!
Upvotes: 2
Views: 489
Reputation: 1428
I was finally able to build a map-reduce aggregation to do get my job done, without changing the schema.
var map_function = function(){
for( x in this.clicked){
var key = this.clicked[x];
emit(key,1);
}
};
var reduce_function = function(a,b){
return Array.sum(b);
};
db.imp.mapReduce( map_function, reduce_function,"id").find()
Upvotes: 1
Reputation: 103365
If you could do away with the present schema and redesign it in such a way that clicked is an array that has key-value pair as its elements, then you could apply the aggregation framework to achieve the desired result.
In Mongo, you can convert the schema by iterating over the documents using the forEach() method of the find()
cursor and updating the clicked field with an array of key-value pair objects:
db.collection.find().forEach(function (doc){
var obj = {},
keys = Object.keys(doc.clicked),
clicked = keys.map(function (key){
obj.position = parseInt(key);
obj.elementId = doc.clicked[key]
return obj;
});
doc.clicked = clicked;
db.collection.save(doc);
});
After changing the schema using the above, your documents would have the following structure:
{
"name": "foo",
"clicked": [
{ "position": 0, "elementId": 6723 },
{ "position": 1, "elementId": 1415 },
{ "position": 2, "elementId": 1122 }
]
},
{
"name": "bar",
"clicked": [
{ "position": 8, "elementId": 1423 },
{ "position": 9, "elementId": 1415 },
{ "position": 10, "elementId": 1122 }
]
},
{
"name": "xyz"
"clicked": [
{ "position": 22, "elementId": 6723 },
{ "position": 23, "elementId": 1415 },
{ "position": 2, "elementId": 1234 }
]
}
It would be quite an easy feat to get the desired aggregation by using the aggregation framework. This would entail an aggregation pipeline that consists of an $unwind
and $group
operators, with the $unwind
as its first pipeline step. This deconstructs the clicked
array field from the input documents to output a document for each element. Each output document replaces the array with an element value.
The $group
operator groups the input documents by the specified elementId
identifier/key and applies the accumulator expression $sum
to each group that would give the count of the grouped documents:
var pipeline = [
{
"$unwind": "$clicked"
},
{
"$group": {
"_id": "$clicked.elementId",
"count": {
"$sum": 1
}
}
}
];
db.collection.aggregate(pipeline)
Output
/* 0 */
{
"result" : [
{
"_id" : 1234,
"count" : 1
},
{
"_id" : 1423,
"count" : 1
},
{
"_id" : 1122,
"count" : 2
},
{
"_id" : 1415,
"count" : 3
},
{
"_id" : 6723,
"count" : 2
}
],
"ok" : 1
}
Converting the result to an object you require only takes the map()
method of the aggregation cursor result:
var result = db.test.aggregate(pipeline)
.map(function(doc){ return {doc["_id"]: doc["count"]} });
printjson(result);
Output:
[
{
6723: 2,
1415: 3,
1423: 1,
1122: 2,
1234: 1
}
]
Upvotes: 0
Reputation: 61225
In [58]: import pymongo
In [59]: from collections import Counter
In [61]: conn = pymongo.MongoClient()
In [62]: db = conn.test
In [63]: col = db.collection
In [64]: result = col.aggregate([{"$group": {"_id": None, "clicked": {"$push": "$clicked"}}}]).next()['clicked']
In [65]: c = Counter()
In [66]: for el in [Counter(i.values()) for i in result]:
....: c += el
....:
In [67]: print(dict(c))
{1122: 2, 6723: 2, 1415: 3, 1234: 1, 1423: 1}
Upvotes: 2