Three task parallel computation

Question

I have a loop that runs for tens of millions of iterations, each iteration corresponding to one row of the data file I'm reading. There are three sequential computations inside the loop. Loosely speaking, we can label them (a) read data, (b) process data, (c) accumulate results. (a), (b), and (c) each take about the same time. (b) depends on (a), and (c) depends on (a) and (b). I think that if I make the program run in 3 threads, with each thread one computation behind its neighbor, I can get about a factor of 3 speedup. Unfortunately, I'm not familiar with multithreading.

The way I see the design is like this:

The first thread reads row n (a);
When this is done, the first thread processes the row (b), and at the same time the second thread reads row n+1;
When the second thread is done reading row n+1, it starts processing it, and the third thread reads row n+2. If the first thread is done with (b), it moves on to (c).

In other words, the sequence of steps is like this:

1a
1b 2a
1c 2b 3a
1a 2c 3b
1b 2a 3c
1c 2b 3a

and so on.
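The staggered schedule above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `schedule` and its parameters are made up here, and it assumes every stage takes exactly one time step, which the question only states approximately.

```python
def schedule(steps, workers=3, stages="abc"):
    """Return one string per time step, listing what each thread is doing.

    Thread k (1-based) starts k-1 steps after thread 1 and then cycles
    through the stages a, b, c forever.
    """
    lines = []
    for t in range(steps):
        active = []
        for k in range(1, workers + 1):
            offset = t - (k - 1)          # steps since thread k started
            if offset >= 0:               # thread k has not started before step k-1
                active.append(f"{k}{stages[offset % len(stages)]}")
        lines.append(" ".join(active))
    return lines


if __name__ == "__main__":
    for line in schedule(6):
        print(line)
```

After the first two warm-up steps, every step has exactly one thread in each of (a), (b), and (c), which is where the hoped-for factor of 3 comes from.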

So, a single row always stays on the same thread. A thread starts a new row once it has finished its own row and the other two threads have read the two preceding rows.

Can somebody help me set this up? These are the only constraints:

b_n can only start when a_n is done
c_n can only start when b_n is done
a_n can only start when a_(n-1) and a_(n-2) are done (since we have 3 threads, and it's faster to read sequentially)
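One possible way to enforce these three constraints is a shared turn counter guarded by a condition variable: reads happen strictly in row order, while (b) and (c) for a row run outside the lock on the thread that read it. The sketch below uses Python's standard `threading` module; `run_pipeline`, `process`, and `accumulate` are hypothetical names, not part of the question. Note that in standard CPython builds the global interpreter lock keeps pure-Python CPU-bound code from running in parallel, so this shows the coordination pattern rather than a guaranteed speedup.

```python
import threading


def run_pipeline(source, process, accumulate, workers=3):
    """Row n is read, processed, and accumulated on thread n % workers.

    source      iterable of rows; it is only ever advanced under the lock
    process     hypothetical stage (b): raw row -> result
    accumulate  hypothetical stage (c): (partial, raw, result) -> None
    Returns one partial accumulator (a list) per thread.
    """
    rows = iter(source)
    cond = threading.Condition()
    state = {"turn": 0, "done": False}          # shared; guarded by cond
    partials = [[] for _ in range(workers)]     # independent storage per thread

    def worker(k):
        while True:
            with cond:
                # a_n may start only after a_(n-1): wait for this thread's turn.
                cond.wait_for(
                    lambda: state["done"] or state["turn"] % workers == k
                )
                if state["done"]:
                    return
                try:
                    raw = next(rows)            # stage (a), sequential by design
                except StopIteration:
                    state["done"] = True        # input exhausted: release everyone
                    cond.notify_all()
                    return
                state["turn"] += 1
                cond.notify_all()
            result = process(raw)               # stage (b), needs a_n only
            accumulate(partials[k], raw, result)  # stage (c), needs a_n and b_n

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return partials


if __name__ == "__main__":
    out = run_pipeline(
        range(10),
        process=lambda x: x * x,
        accumulate=lambda acc, raw, res: acc.append((raw, res)),
    )
    print(out)
```

Because b_n and c_n run on the same thread right after a_n, the first two constraints hold automatically; only the read order needs explicit synchronization. The sketch does not handle an exception raised inside `process` or `accumulate`, which would leave the other threads waiting for a turn that never comes.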

I also understand that each thread will have to have independent storage.
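Independent storage usually means each thread accumulates into its own object, and the partial results are combined once after all threads have finished. As a small sketch, assuming the accumulated results are counts keyed by some label (an assumption made only for illustration), the merge step could look like this; `merge_partials` is a made-up helper name.

```python
from collections import Counter


def merge_partials(partials):
    """Combine per-thread Counter objects into one total.

    Call this only after every thread has been joined, so no lock is needed.
    """
    total = Counter()
    for partial in partials:
        total.update(partial)   # Counter.update adds counts key by key
    return total


if __name__ == "__main__":
    per_thread = [Counter(a=1), Counter(a=2, b=1), Counter(b=4)]
    print(merge_partials(per_thread))
```

This only works if the accumulation in (c) can be split and recombined, as sums and counts can; an order-dependent accumulation would need a different merge.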

Forgot to mention: each row is processed entirely independently.
