Reputation: 10764
The aggregation framework on MongoDB has certain limitations as per this link.
I want to remove the restrictions 2, 3.
I really do not care what the resulting set's size is. I have a lot of RAM and resources.
And I do not care if it takes more than 10% system resources.
I expect both 2, 3 to be violated in my application. Mostly 2.
But I really need the aggregation framework. Is there anything that I can do to remove these limitations?
The application I have been working has these things
Choosing something other than mongodb is a no go. We have already sunk too much into development with MongoDB
Is it advisable to change the source code of Mongo?
Upvotes: 0
Views: 2275
Reputation: 5757
1) Saving aggregated values directly to some collection(like with MapReduce) will released in future versions, so first solution is just wait for a while :)
2) If you hit 2-nd or 3-rd limitation may you should redesign your data scheme and/or aggregation pipeline. If you working with large time series, you can reduce number of aggregated docs and do aggregation in several steps (like MapReduce do). I can't say more concretely, because I don't know your data/use cases(give me a comment).
3) You can choose different framework. If you familiar with MapReduce concept, you can try Hadoop(it can use MongoDB as data source). I don't have experience with MongoDB-Hadoop integration, but I mast warn you NOT to use Mongo's MapReduce -- it sucks hard on large datasets.
4) You can do aggregation inside your code, but you should use some "lowlevel" language or library. For example, pymongo (http://api.mongodb.org/python/current/) is not suitable for such things, but you can tray something like monary(https://bitbucket.org/djcbeach/monary/wiki/Home) to efficiently extract date and NumPy or Pandas to aggregate it the way want.
Upvotes: 2