SolessChong

Reputation: 3407

How to `parfor` nested loops?

The original code is like this:

for i = 1 : size(H, 1)
    for j = 1 : size(H, 2)
        H{i,j} blabla
    end
end

and I tried to adapt it into parallel code like this:

parfor ind = 1 : numel(H)
    [i, j] = ind2sub(ind);
    H{i,j} blabla
end

which generates an error saying parfor cannot run due to H{i,j}.

Then what's the error here? And how can I adapt the nested loop into parfor?

One possible solution is

for i = 1 : size(H, 1)
    parfor j = 1 : size(H, 2)
        H{i,j} blabla
    end
end

But I suspect that using a parfor inside another loop multiplies the parfor overhead, which would add computation time.

Upvotes: 1

Views: 1405

Answers (2)

Bentoy13

Reputation: 4974

I think parfor fails here because Matlab is unable to detect that [i,j] is unique across iterations, since it is the result of a function call. As far as the engine can tell, you might access H{i,j} multiple times, so the iterations are not recognized as independent of each other.

Edit: as mentioned by patrik, you have to be sure that no iteration depends on another: here, H{i,j} must not depend on H{k,l} for (i,j) != (k,l), and a value computed in one iteration must not be used in another. This independence is the basic requirement for a parfor, apart from reduction assignments.
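To make the independence requirement concrete, here is a minimal sketch with made-up variables (`x`, `y`, `z`, `total` and the size `n` are illustrative assumptions, not part of the question). It contrasts a reduction assignment and an independent write, which parfor accepts, with a loop whose iterations depend on each other and must therefore stay a plain for-loop.

```matlab
n = 100;                 % illustrative size (assumption)
x = rand(1, n);

% Reduction assignment: accepted by parfor, because the additions can be
% carried out in any order (up to floating-point rounding).
total = 0;
parfor k = 1:n
    total = total + x(k);
end

% Independent iterations: iteration k writes only y(k) and reads only x(k).
y = zeros(1, n);
parfor k = 1:n
    y(k) = 2 * x(k);
end

% Dependent iterations: iteration k reads the result of iteration k-1,
% so this loop is kept as an ordinary for-loop.
z = zeros(1, n);
z(1) = x(1);
for k = 2:n
    z(k) = z(k-1) + x(k);
end
```

Without the Parallel Computing Toolbox, the parfor loops above should simply run serially, so the sketch can still be tried out.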

Besides that point, if you want to run independent computations in parallel, and if it is worth it, always choose to parfor the outermost loop. Also remember that Matlab does not allow nested parfor; instead, if you want to parallelize inner for-loops, you have to write a function that runs the parfor. Parallelizing inner loops may not bring any speed-up (it depends on how many workers there are in the parpool).

In my experience, running inner loops in parallel is not recommended. As an example (outside Matlab), I would cite LibSVM, which recommends parallelizing only the outermost loop with OpenMP if you want to speed up the computation, and never the inner loops.

The reason for this recommendation is that you have a limited pool of workers, and workers may be viewed as threads; beyond a certain point, adding threads makes the computation run slower because of the time spent switching between them. Matlab may manage this part very well, but the point is that your pool of workers is limited in size. If each outermost iteration takes a lot of time and you have many iterations, you will gain nothing by parallelizing inner loops, because each worker will already be busy running a whole iteration (including its inner loops).

Nevertheless, it is always worth testing each option; some of them may, counter-intuitively, turn out to suit your problem better!
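One way to test the options is to time them with tic/toc. The following is only a sketch under stated assumptions: the 8-by-8 size and the `sum(svd(rand(60)))` workload are placeholders for the real computation, and the variable names are made up for illustration.

```matlab
nRows = 8;               % placeholder sizes (assumption)
nCols = 8;

% Option 1: parfor on the outermost loop, plain for-loop inside.
H1 = cell(nRows, nCols);
tic;
parfor r = 1:nRows
    for c = 1:nCols
        H1{r, c} = sum(svd(rand(60)));   % placeholder workload
    end
end
tOuter = toc;

% Option 2: plain outer loop, parfor on the inner loop.
H2 = cell(nRows, nCols);
tic;
for r = 1:nRows
    parfor c = 1:nCols
        H2{r, c} = sum(svd(rand(60)));   % same placeholder workload
    end
end
tInner = toc;

fprintf('outer parfor: %.3f s, inner parfor: %.3f s\n', tOuter, tInner);
```

The first parfor that runs may also pay for starting the pool, so it is probably fairer to run the script twice and compare the timings from the second run.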

Upvotes: 2

Edric

Reputation: 25160

Why not simply use the linear index to assign into H? For example:

H = cell(4, 4);
parfor idx = 1:16
  [i, j] = ind2sub([4, 4], idx);
  H{idx} = rand(i, j); % or whatever
end

Otherwise, it's always best to make the outermost loop the PARFOR loop. The following also works:

H = cell(4, 4);
parfor r = 1:4
  for c = 1:4
    H{r, c} = rand(r, c);
  end
end
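For a cell array whose size is not hard-coded, as in the question, the linear-index version can be sketched as follows. This is an illustration under assumptions: the 3-by-5 size and the `r + c` placeholder stand in for the real data and computation, and `sz`, `n`, `r`, `c` are names chosen here.

```matlab
H = cell(3, 5);          % placeholder size (assumption)
sz = size(H);            % computed once, outside the loop
n = numel(H);

parfor idx = 1:n
    % Recover the row and column from the linear index; note that
    % ind2sub needs the array size as its first argument.
    [r, c] = ind2sub(sz, idx);
    % Index H with the loop variable itself, so that parfor can see
    % that each iteration writes a distinct element.
    H{idx} = r + c;      % placeholder computation
end
```

After the loop, `H{2, 3}` holds the value computed for row 2 and column 3, because linear and subscript indexing address the same elements.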

Upvotes: 1
