Yin Zhu

Reputation: 17119

Any Python library for parallel and distributed tasks?

I am looking for a Python library that can distribute tasks across a few servers. The tasks would be similar to what can be parallelized with the subprocess library on a single machine.

I know that I can set up a Hadoop system for such purposes. However, Hadoop is heavyweight. In my case, I would like to use a shared network disk for data I/O, and I don't need any fancy failure recovery. In MapReduce's terminology, I only need mappers, no aggregators or reducers.

Any such library in Python? Thanks!

Upvotes: 1

Views: 194

Answers (1)

PaulMcG

Reputation: 63762

Try using Celery.

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
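Since you only need mapper-style fan-out over a shared disk, a single task function is enough. Here is a minimal sketch; the Redis broker URL, the file paths, and the `process` task body are illustrative assumptions (Celery also works with other brokers such as RabbitMQ):

```python
# tasks.py -- minimal Celery sketch; broker URL and task body are assumptions
from celery import Celery

# Assumes a Redis broker on localhost; point this at whichever broker you run.
app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def process(path):
    """Mapper-style task: read input from the shared network disk,
    do the work, and write the result back next to the input file."""
    with open(path) as f:
        data = f.read()
    result = data.upper()  # stand-in for the real computation
    with open(path + '.out', 'w') as f:
        f.write(result)
    return path
```

Start a worker on each server with `celery -A tasks worker`, making sure the shared disk is mounted at the same path everywhere, then enqueue tasks from any machine:

```python
from tasks import process

for p in ['/mnt/shared/in1.txt', '/mnt/shared/in2.txt']:
    process.delay(p)  # returns immediately; an idle worker picks the task up
```

Each `.delay()` call just puts a message on the broker, so there is no reducer step unless you add one, which matches your mappers-only requirement.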

Upvotes: 3
