Reputation: 1230
Briefly, octopy and mincemeatpy are lightweight Python implementations of map-reduce, and clients can join the cluster in an ad-hoc manner without installing anything (except Python, of course). Project details: OCTOPY and Mincemeatpy.
The problem is that both hold the entire data set in memory, including the intermediate key-value pairs. So even for moderately sized data they throw out-of-memory exceptions.
The key reasons I'm using them are:
So my question is: is there a package that does the same job but is not purely in-memory, so that it can handle moderately sized data?
Upvotes: 4
Views: 678
Reputation: 612
Try PyMapReduce. It runs on your own machine, but across several processes, so you don't need to set up a master-node architecture. It has plenty of runners, for example DiskBasedRunner, which seems to store the map output in temp files and then reduce them.
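To make the disk-based idea concrete, here is a minimal standard-library sketch. It is not PyMapReduce's API and not how DiskBasedRunner is actually implemented; `disk_map_reduce`, `word_count_map`, and `word_count_reduce` are hypothetical names I made up for illustration. It assumes keys are strings and values are JSON-serializable. Map output is spilled to hash-partitioned temp files, and the reduce step then loads one partition at a time, so only a single partition's pairs have to fit in memory.

```python
import json
import os
import tempfile
import zlib
from collections import defaultdict


def disk_map_reduce(records, map_fn, reduce_fn, num_partitions=8):
    """Yield (key, reduced_value) pairs, keeping intermediate data on disk.

    Hypothetical helper for illustration only; assumes string keys and
    JSON-serializable values.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = [os.path.join(tmpdir, "part-%d.jsonl" % i)
                 for i in range(num_partitions)]
        files = [open(p, "w", encoding="utf-8") for p in paths]
        try:
            # Map phase: write each emitted pair straight to a partition file.
            for record in records:
                for key, value in map_fn(record):
                    # crc32 is stable across runs, so a given key always
                    # lands in the same partition file.
                    idx = zlib.crc32(str(key).encode("utf-8")) % num_partitions
                    files[idx].write(json.dumps([key, value]) + "\n")
        finally:
            for f in files:
                f.close()

        # Reduce phase: group and reduce one partition at a time.
        for path in paths:
            groups = defaultdict(list)
            with open(path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    groups[key].append(value)
            for key, values in groups.items():
                yield key, reduce_fn(key, values)


def word_count_map(line):
    # Emit (word, 1) for every whitespace-separated word.
    for word in line.split():
        yield word.lower(), 1


def word_count_reduce(key, values):
    return sum(values)


if __name__ == "__main__":
    lines = ["the quick brown fox", "the lazy dog", "The fox"]
    counts = dict(disk_map_reduce(lines, word_count_map, word_count_reduce))
    print(counts["the"], counts["fox"])  # prints: 3 2
```

If a single partition is still too large for memory, raising `num_partitions` shrinks each one. This sketch is single-process; spreading the partitions over several processes would be a separate step.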
Upvotes: 3