Reputation: 1863
I have a MongoDB collection whose structure is as follows:
{
    "_id" : "mongo",
    "log" : [
        {
            "ts" : ISODate("2011-02-10T01:20:49Z"),
            "visitorId" : "25850661"
        },
        {
            "ts" : ISODate("2014-11-01T14:35:05Z"),
            "visitorId" : NumberLong(278571823)
        },
        {
            "ts" : ISODate("2014-11-01T14:37:56Z"),
            "visitorId" : NumberLong(0)
        },
        {
            "ts" : ISODate("2014-11-04T06:23:48Z"),
            "visitorId" : NumberLong(225200092)
        },
        {
            "ts" : ISODate("2014-11-04T06:25:44Z"),
            "visitorId" : NumberLong(225200092)
        }
    ],
    "uts" : ISODate("2014-11-04T06:25:43.740Z")
}
"mongo" is a search term, and "ts" indicates the timestamp when it was searched on the website.
"uts" indicates the last time it was searched.
So the search term "mongo" was searched 5 times on our website.
I need to get the top 50 most searched terms in the past 3 months.
I am no expert in aggregation in MongoDB, but I was trying something like this to at least get the data for the past 3 months:
db.collection.aggregate(
    { $group: { _id: "$_id", count: { $sum: 1 } } },
    { $match: { "log.ts": { "$gte": new Date("2014-09-01") } } }
)
It gave me this error:
exception: sharded pipeline failed on shard DSink9: { errmsg: "exception: aggregation result exceeds maximum document size (16MB)", code: 16389
Can anyone please help me?
UPDATE
I was able to write a query, but it gives me a syntax error.
db.collection.aggregate(
    { $unwind: "$log" },
    { $project: { log: "$log.ts" } },
    { $match: { log: { "$gte": new Date("2014-09-01"), "$lt": new Date("2014-11-04") } } },
    { $project: { _id: { val: { "$_id" } } } },
    { $group: { _id: "$_id", sum: { $sum: 1 } } }
)
Upvotes: 0
Views: 2226
Reputation: 151170
You are exceeding the maximum document size in a result, but generally that is an indication that you are "doing it wrong", particularly given your example of counting searches for the term "mongo" in your stored data between two dates:
db.collection.aggregate([
    // Always match first; it reduces the workload and is the only place an index can be used
    { "$match": {
        "_id": "mongo",
        "log.ts": {
            "$gte": new Date("2014-09-01"), "$lt": new Date("2014-11-04")
        }
    }},
    // Unwind the array to de-normalize as documents
    { "$unwind": "$log" },
    // Match again after unwinding, to "filter" the individual array entries
    { "$match": {
        "log.ts": {
            "$gte": new Date("2014-09-01"), "$lt": new Date("2014-11-04")
        }
    }},
    // Group the count on `_id`
    { "$group": {
        "_id": "$_id",
        "count": { "$sum": 1 }
    }}
]);
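To get the "top 50 most searched terms in the past 3 months" from the original question, the same shape of pipeline can be extended with `$sort` and `$limit` stages (dropping the `"_id"` match so all terms are counted). This is a sketch; the date bounds are taken from the question, and the stage order is an assumption:

```javascript
// Sketch: count searches per term in the range, then keep the 50 highest.
// The $sort / $limit stages are an assumed extension of the pipeline above.
var pipeline = [
    // Filter documents up front so an index on "log.ts" can be used
    { "$match": {
        "log.ts": { "$gte": new Date("2014-09-01"), "$lt": new Date("2014-11-04") }
    }},
    // One document per log entry
    { "$unwind": "$log" },
    // Re-filter: after $unwind, entries outside the range are exposed again
    { "$match": {
        "log.ts": { "$gte": new Date("2014-09-01"), "$lt": new Date("2014-11-04") }
    }},
    // Count searches per term (the term is the _id)
    { "$group": { "_id": "$_id", "count": { "$sum": 1 } } },
    // Highest counts first, keep only the top 50
    { "$sort": { "count": -1 } },
    { "$limit": 50 }
];

// In the mongo shell:
// db.collection.aggregate(pipeline);
```

Sorting after `$group` keeps the sort input small (one document per term), which also helps stay under the result size limit.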
Upvotes: 3
Reputation: 768
Your aggregation result exceeds MongoDB's maximum document size. You can use the allowDiskUse
option to prevent this; as of MongoDB shell version 2.6
this will not throw an exception. Look at this aggregate. You can also optimize your query to decrease the size of the pipeline result. For this, look at this question aggregation result.
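For reference, `allowDiskUse` is passed as an option to `aggregate()` (available from MongoDB 2.6), letting pipeline stages spill to temporary files instead of failing. A minimal sketch of the call shape, where `pipeline` stands for any stage array:

```javascript
// Sketch: enabling allowDiskUse so large pipeline stages may use temp files
// (MongoDB 2.6+; "pipeline" here is a placeholder for your stage array).
var options = { allowDiskUse: true };

// In the mongo shell this would be:
// db.collection.aggregate(pipeline, options);
```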
Upvotes: 0