Reputation: 442
I get the following error: "The variable X_bs in a parfor cannot be classified" when trying to run the following parfor loop:
y = zeros(1000,1);
parfor bb = 1:1000
    rng(bb)
    % deleted line: X_bs{8} = [];
    for ii = 1:8
        ind = ceil(N(ii)*rand(N(ii),1));
        X_bs{ii} = X{ii}(ind,:);
    end
    y(bb) = another_function(X_bs);
end
X is a 1x8 cell array, each cell containing an N(ii)x4 matrix (N(ii) varies from cell to cell). The code resamples rows of X and creates a cell array called X_bs. I pass X_bs through a function that outputs the variable that I am interested in, y(bb).
Why am I getting this error? How do I get around this?
Here is an example:
X{1} = [1 ; 2 ; 3];
X{2} = [4 ; 5 ; 6; 7];
N(1) = 3; % size of X{1}
N(2) = 4; % size of X{2}
parfor bb=1:10
    rng(bb)
    for ii = 1:2
        X_bs{ii} = zeros(N(ii),1);
        ind = ceil(N(ii)*rand(N(ii),1));
        X_bs{ii} = X{ii}(ind,:);
    end
    % Output is a function of X_bs. For illustration, say it is the sum
    y(bb) = sum(X_bs{1}) + sum(X_bs{2});
end
The above code produces the same error as earlier: "The variable X_bs in a parfor cannot be classified". Here is a simpler version that does work and does not use cell arrays:
X = [1 ; 2 ; 3];
N = 3; % size of X
parfor bb=1:10
    rng(bb)
    X_bs = zeros(N,1);
    ind = ceil(N*rand(N,1));
    X_bs = X(ind,:);
    y(bb) = sum(X_bs);
end
The problem (I think) lies in overwriting the cell array. Perhaps parfor treats cell arrays as sliced variables rather than temporary variables. Any thoughts?
Update: Adriaan suggested that there is a problem with the random vector ind and the cell array element X_bs{ii}. Here is a simpler example that does not use the random vector ind and still produces the same error:
X{1} = [1 ; 2 ; 3];
X{2} = [4 ; 5 ; 6; 7];
N(1) = 3; % size of X{1}
N(2) = 4; % size of X{2}
parfor bb=1:10
    for ii = 1:2
        X_bs{ii} = X{ii};
    end
    % Output is a function of X_bs. For illustration, say it is the sum
    y(bb) = sum(X_bs{1}) + sum(X_bs{2});
end
Therefore, I am fairly certain MATLAB isn't treating the cell array X_bs, which is assigned through X_bs{ii}, as a temporary variable.
Upvotes: 1
Views: 793
Reputation: 442
A solution (thanks to Adriaan's comment) is to put the for loop that generates X_bs in a separate function, randomize_X. See below for a solution to the working example:
X{1} = [1 ; 2 ; 3];
X{2} = [4 ; 5 ; 6; 7];
N(1) = 3; % size of X{1}
N(2) = 4; % size of X{2}
parfor bb=1:10
    X_bs = randomize_X(X,N);
    % Output is a function of X_bs. For illustration, say it is the sum
    y(bb) = sum(X_bs{1}) + sum(X_bs{2});
end
where
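function X_bs = randomize_X(X,N)
    % Resample the rows of each cell of X with replacement.
    X_bs = cell(size(X));                 % preallocate the output cell array
    for ii = 1:numel(X)
        ind = ceil(N(ii)*rand(N(ii),1));  % random row indices in 1..N(ii)
        X_bs{ii} = X{ii}(ind,:);
    end
end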
I'm fairly certain that the parfor loop treats cell arrays as sliced variables, so they need to be specified beforehand and cannot be overwritten within the parfor loop. To get around this, a simple trick is to send the computation of the cell array to another function. The parfor loop has similar problems with the optimization toolbox CVX. The same trick can be employed: call CVX in a subroutine within parfor.
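An alternative that may avoid the helper function, sketched here under my reading of the parfor variable classification rules: a variable counts as temporary when it is the target of a direct, nonindexed assignment inside the loop body. X_bs{ii} = ... is an indexed assignment on a variable that is not indexed by the loop variable bb, which would explain why it fits neither the sliced nor the temporary category. Assigning the whole cell array first with cell(...) should make it temporary. The name nCells and the use of randi are my own choices, not part of the original code.

```matlab
X = {[1; 2; 3], [4; 5; 6; 7]};
N = cellfun(@(x) size(x, 1), X);    % number of rows in each cell
nCells = numel(X);                  % hypothetical helper name
y = zeros(10, 1);

parfor bb = 1:10
    rng(bb);                        % per-iteration seed, as in the question
    X_bs = cell(1, nCells);         % direct, nonindexed assignment: X_bs is temporary
    for ii = 1:nCells
        ind = randi(N(ii), N(ii), 1);   % resample row indices with replacement
        X_bs{ii} = X{ii}(ind, :);
    end
    y(bb) = sum(X_bs{1}) + sum(X_bs{2});
end
```

Each iteration builds its own X_bs from scratch, so nothing carries over between iterations and the order in which workers run them does not matter.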
Upvotes: 0
Reputation: 18177
The problem is created on the line:
ind = ceil(N(ii)*rand(N(ii),1));
and surfaces here:
X_bs{ii} = X{ii}(ind,:);
The size of ind is "unknown" to MATLAB before executing the parfor, and hence the loop cannot run. parfor does not run its iterations in consecutive order, making it paramount that all required sizes are specified beforehand. You can probably circumvent this by calling clear X_bs just before the invocation of y(bb); this removes the cells from memory and makes sure the contained matrices are not of a different size than the ones you assigned in the previous iteration.
If, on the other hand, this does not solve it, you can't parallelise this, since you are using randomly sized matrices each time. To expand on this: rand(N(ii)) can produce quite a range of numbers. Regardless of that range, the important fact is that it is indeterminable, meaning you cannot know what it will be before executing it. Normally this is no problem, since a cell will store any old matrix size, but the parfor environment needs to know all sizes beforehand in order to optimise memory and CPU usage before assigning jobs to workers.
If possible, get the maximum size ind can attain, initialise the required matrix in a temporary variable at the maximum allowable size, and access only the used entries.
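A minimal sketch of that last suggestion, assuming the column-vector data from the question's small example. The names maxN, buf, and total are hypothetical, and randi is my substitution for ceil(N(ii)*rand(...)). The buffer is assigned as a whole at the top of each iteration, and only its first N(ii) rows are read back for each column.

```matlab
X = {[1; 2; 3], [4; 5; 6; 7]};
N = cellfun(@(x) size(x, 1), X);    % number of rows in each cell
nCells = numel(X);
maxN = max(N);                      % largest size any resampled block can have
y = zeros(10, 1);

parfor bb = 1:10
    rng(bb);
    buf = zeros(maxN, nCells);      % fixed-size temporary buffer, one column per cell
    total = 0;
    for ii = 1:nCells
        ind = randi(N(ii), N(ii), 1);          % resample row indices with replacement
        buf(1:N(ii), ii) = X{ii}(ind, 1);      % fill only the used entries
        total = total + sum(buf(1:N(ii), ii)); % read back only the used entries
    end
    y(bb) = total;
end
```

This only fits when every cell has the same number of columns (one here); for the N(ii)x4 matrices in the original question the buffer would need a third dimension.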
Upvotes: 1