Reputation: 443
I have seen this asked a couple of years ago. Since then MongoDB 2.4 has multi-threaded Map Reduce available (after the switch to the V8 Javascript engine) and has become faster than what it was in previous versions and so the argument of being slow is not an issue.
However, I am looking for a scenario where a Map Reduce approach might work better than the Aggregation Framework. Infact, possibly a scenario where the Aggregation Framework cannot work at all but the Map Reduce can get the required results.
Thanks, John
Upvotes: 0
Views: 243
Reputation: 59763
Currently the Aggregation Framework results can't exceed 16MB. But, I think more importantly, you'll find that the AF is better suited to "here and now" type queries that are dynamic in nature (like filters are provided at run-time by the user for example).
A MapReduce is preplanned and can be far more complex and produce very large outputs (as they just output
to a new collection). It has no run-time inputs that you can control. You can add complex object manipulation that simply is not possible (or efficient) with the AF. It's simple to manipulate child arrays (or things that are array like) for example in MapReduce as you're just writing JavaScript, whereas in the AF, things can become very unwieldy and unmanageable.
The biggest issue is that MapReduce's aren't automatically kept up to date and they're difficult to predict when they'll complete). You'll need to implement your own solution to keeping them up to date (unlike some other NoSQL options). Usually, that's just a timestamp of some sort and an incremental MapReduce update as shown here). You'll possibly need to accept that the data may be somewhat stale and that they'll take an unknown length of time to complete.
If you hunt around on StackOverflow, you'll find lots of very creative solutions to solving problems with MongoDB and many solutions use the Aggregation Framework as they're working around limitations of the general query engine in MongoDB and can produce "live/immediate" results. (Some AF pipelines are extremely complex though which may be a concern depending on the developers/team/product).
Upvotes: 1
Reputation: 146
Take a look to this.
The Aggregation FW results are stored in a single document so are limited to 16 MB: this might be not suitable for some scenarios. With MapReduce there are several output types available including a new entire collection so it doesn't have space limits.
Generally, MapReduce is better when you have to work with large data sets (may be the entire collection). Furthermore, it gives much more flexibility (you write your own aggregation logic) instead of being restricted to some pipeline commands.
Upvotes: 1