phynfo

Reputation: 4938

Running the Bulk Synchronous Parallel (BSP) model in Python

The BSP parallel programming model has several benefits: the programmer need not care about synchronization explicitly, deadlocks become impossible, and reasoning about speed becomes much easier than with traditional methods. There is a Python interface to BSPlib in SciPy:

import Scientific.BSP

I wrote a little program to test BSP. The program is a simple random experiment that "calculates" the probability that throwing n dice yields a sum of k:

from Scientific.BSP import ParSequence, ParFunction, ParRootFunction
from sys import argv
from random import randint

n = int(argv[1]) ; m = int(argv[2]) ; k = int(argv[3])

# counts how many of the local throws sum to k
def sumWuerfe(ws): return len([w for w in ws if sum(w)==k])
glb_sumWuerfe = ParFunction(sumWuerfe)

# prints the estimated probability; runs on processor 0 only
def ausgabe(result): print float(result)/len(wuerfe)
glb_ausgabe = ParRootFunction(ausgabe)

# m throws of n dice each, distributed over the processors
wuerfe = [[randint(1,6) for _ in range(n)] for _ in range(m)]
glb_wuerfe = ParSequence(wuerfe)

# the parallel computation:
ergs = glb_sumWuerfe(glb_wuerfe)
# collecting the results on processor 0:
ergsGesamt = ergs.reduce(lambda x,y: x+y, 0)

glb_ausgabe(ergsGesamt)

The program works fine, but it uses just one process!

My question: does anyone know how to tell this Python BSP script to use 4 (or 8 or 16) processes? I thought this BSP implementation would use MPI, but starting the script via mpiexec -n 4 randExp.py doesn't work.

Upvotes: 3

Views: 1132

Answers (1)

Jonathan Dursi

Reputation: 50927

A minor thing, but Scientific Python != SciPy in your question...

If you download the ScientificPython sources you'll see a README.BSP, a README.MPI, and a README.BSPlib. Unfortunately, the information in those files isn't really mentioned on the project's web pages.

The README.BSP is pretty explicit about what you need to do to get the BSP stuff running with real parallelism:

In order to use the module Scientific.BSP using more than one real processor, you must compile either the BSPlib or the MPI interface. See README.BSPlib and README.MPI for installation details. The BSPlib interface is probably more efficient (I haven't done extensive tests yet), and allows the use of the BSP toolset, on the other hand MPI is more widely available and might thus already be installed on your machine. For serious use, you should probably install both and make comparisons for your own applications. Application programs do not have to be modified to switch between MPI and BSPlib, only the method to run the program on a multiprocessor machine must be adapted.

To execute a program in parallel mode, use the mpipython or bsppython executable. The manual for your MPI or BSPlib installation will tell you how to define the number of processors.

and the README.MPI tells you what to do to get MPI support:

Here is what you have to do to get MPI support in Scientific Python:

1) Build and install Scientific Python as usual (i.e. "python setup.py install" in most cases).

2) Go to the directory Src/MPI.

3) Type "python compile.py".

4) Move the resulting executable "mpipython" to a directory on your system's execution path.
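Put together, the build boils down to a few commands. This is only a rough sketch of the steps quoted above; the install destination for mpipython is illustrative and depends on your system:

# 1) build and install ScientificPython itself
python setup.py install
# 2) + 3) build the MPI-enabled interpreter from the source tree
cd Src/MPI
python compile.py
# 4) put mpipython somewhere on your execution path (target directory is illustrative)
mv mpipython /usr/local/bin/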

So you have to build the BSPlib or MPI support explicitly to take advantage of real parallelism. The good news is that you shouldn't have to change your program. The reason is that different systems have different parallel libraries installed, and a library that sits on top of them needs a configuration/build step like this to take advantage of whatever is available.
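Once mpipython is built, you launch your unchanged script through it instead of plain python. The exact launcher and flags depend on your MPI installation (mpiexec/mpirun syntax varies), and the argument values for n, m, k below are just examples:

# run the dice experiment on 4 processes via the MPI interface
mpiexec -n 4 mpipython randExp.py 3 100000 10
# for the BSPlib interface, run randExp.py through bsppython with your
# BSPlib installation's launcher instead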

Upvotes: 3
