Reputation: 1214
For a graduate assignment I need to implement 4 parallel algorithms. I am a complete beginner to parallel algorithms, so I do not know what to study, which technology to use (Threads, MPI, OpenMP, ...), and so on.
For clarity below is a Pascal-like pseudocode for one of the algorithms.
procedure BROADCAST(D, N, A)
  Step 1: Processor P1
    (i)   reads the value in D,
    (ii)  stores it in its own memory, and
    (iii) writes it in A(1)
  Step 2: for i = 0 to (log N - 1) do
            for j = 2^i + 1 to 2^(i+1) do in parallel
              Processor Pj
                (i)   reads the value in A(j - 2^i),
                (ii)  stores it in its own memory, and
                (iii) writes it in A(j)
            end for
          end for
Upvotes: 0
Views: 642
Reputation: 11516
Here's a very concise overview of the three "methods" you suggested:
Threads: you get a lot of control over what your program does, but it comes at the cost of having to handle all of the bookkeeping yourself (mutexes, semaphores, making sure there are no deadlocks, ...). This is a good choice when you have multiple threads that need to communicate with one another. In your case, however, all your threads will basically be doing the same thing. Also, debugging a multi-threaded program is very hard, and just when you think you have found all the bugs, another deadlock rears its ugly head.
MPI: this is a library to pass messages between different processes. As such, it's not really good for what you're trying to do here. It would be, however, in case you wanted to parallelize your program on a computer cluster. Then data would need to be passed between processes on different computers. That's where MPI shines.
OpenMP: a very convenient library that does all the parallelizing for you (of course you have to give it some info, but still, no need to worry about deadlocks and the like). No need to do any bookkeeping; the library does all this for you. What's particularly nice about OpenMP is that it works with pragmas: you code your program sequentially, test it, then add a few #pragma lines, compile with the -fopenmp flag (or whatever flag your particular compiler wants), and there you go: parallelized program. When you leave out the compiler flag, the pragmas are simply ignored, making this a very portable solution as well. This is great for programs where you want a speedup by parallelizing algorithms, but don't really need to pass around data between independent threads.
What I would do is use OpenMP, because it's very, very easy to use, lets you enable/disable the multi-threading without any changes to your code, does almost everything for you, and is, with some luck, already included with your compiler (at least for GCC this is the case). Here is a quite extensive tutorial: https://computing.llnl.gov/tutorials/openMP. You can find tons more on Google; the internet is full of them.
Edit: Of course, nothing prevents you from using these three methods at the same time. If you want to create some multi-threaded, distributed application which has algorithm implementations that can easily be parallelized, that's exactly what you would do.
Upvotes: 4