darshan

Reputation: 1230

Alternatives to mincemeatpy and octopy

Briefly, octopy and mincemeatpy are lightweight Python implementations of map-reduce, and clients can join the cluster in an ad-hoc manner without requiring any installation (except Python, of course). Here are the project details: OCTOPY and Mincemeatpy.

The problem with these is that they need to hold the entire data set in memory (including the intermediate key-value pairs). So even for moderately sized data, they throw out-of-memory exceptions.

The key reasons I'm using them are:

  1. Python.
  2. No cluster installation required.
  3. I'm just prototyping, and I can directly port the algorithm once I'm ready.

So my question is: is there any package that handles the same tasks but is not purely in-memory, so that it can handle moderately sized data?

Upvotes: 4

Views: 678

Answers (1)

floatdrop

Reputation: 612

Try PyMapReduce. It runs on your own machine but across several processes, so you don't need to set up a master-node architecture. It has plenty of runners, for example DiskBasedRunner, which seems to store the map output in temp files and then reduce them.
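To illustrate the general idea of spilling map output to temp files and reducing afterwards, here is a minimal standard-library sketch. It is not PyMapReduce's API or its DiskBasedRunner implementation; `disk_map_reduce`, `word_count_map`, `word_count_reduce`, and the `num_buckets` parameter are hypothetical names made up for this example. It assumes keys are strings and values are JSON-serializable, and that the final results (one entry per key) fit in memory.

```python
import json
import os
import tempfile
import zlib
from collections import defaultdict


def disk_map_reduce(records, map_fn, reduce_fn, num_buckets=8):
    """Hypothetical helper: spill intermediate pairs to temp files,
    then reduce one bucket at a time.

    Only one bucket's intermediate pairs are held in memory at once.
    Assumes string keys and JSON-serializable values.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = [os.path.join(tmpdir, f"bucket_{i}.jsonl")
                 for i in range(num_buckets)]
        files = [open(p, "w", encoding="utf-8") for p in paths]
        try:
            # Map phase: write each (key, value) pair to the bucket file
            # chosen by a hash of the key, so equal keys share a file.
            for record in records:
                for key, value in map_fn(record):
                    # crc32 is stable across runs, unlike hash() on str.
                    bucket = zlib.crc32(key.encode("utf-8")) % num_buckets
                    files[bucket].write(json.dumps([key, value]) + "\n")
        finally:
            for f in files:
                f.close()

        # Reduce phase: load one bucket, group its values by key, reduce.
        for path in paths:
            grouped = defaultdict(list)
            with open(path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    grouped[key].append(value)
            for key, values in grouped.items():
                results[key] = reduce_fn(key, values)
    return results


def word_count_map(line):
    # Emit (word, 1) for every whitespace-separated word.
    for word in line.split():
        yield word.lower(), 1


def word_count_reduce(key, values):
    return sum(values)


if __name__ == "__main__":
    lines = ["the quick brown fox", "the lazy dog", "The end"]
    print(disk_map_reduce(lines, word_count_map, word_count_reduce))
```

Raising `num_buckets` makes each bucket smaller, which lowers peak memory at the cost of more open files. If `records` is a generator (for example, a file object iterated line by line), the input never has to be loaded in full either.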

Upvotes: 3
