Girardi

Reputation: 2811

Why is my MATLAB job taking a long time to run?

I have a function (a convolution) that becomes very slow when it operates on matrices with a large number of columns (function code below). I therefore want to parallelize the code.

Example MATLAB code:

x = zeros(1,100);
x(rand(1,100)>0.8) = 1;
x = x(:);
c = convContinuous(1:100,x,@(t,p)p(1)*exp(-(t-p(2)).*(t-p(2))./(2*p(3).*p(3))),[1,0,3],false)
plot(1:100,x,1:100,c)

If x is a matrix with many columns, the code gets very slow. My first attempt was to change the for statement to parfor, but that failed (see Concluding remarks below).

My second attempt was to follow this example, which shows how to schedule tasks in a job and then submit the job to a local server. That example is implemented in my function below and is enabled by setting the last argument, isParallel, to true.

The example MATLAB code would be:

x = zeros(1,100);
x(rand(1,100)>0.8) = 1;
x = x(:);
c = convContinuous(1:100,x,@(t,p)p(1)*exp(-(t-p(2)).*(t-p(2))./(2*p(3).*p(3))),[1,0,3],true)

Now, MATLAB tells me:

Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.

Warning: This job will remain queued until the Parallel Pool is closed.

The MATLAB command window then hangs, waiting for something to finish. When I open the Job Monitor via Home -> Parallel -> Monitor Jobs, I see two jobs, one of which is in the running state. Neither of them ever finishes.

Questions

File convContinuous.m

function res = convContinuous(tData, sData, smoothFun, par, isParallel)
% performs the convolution of a series of deltas with a smooth function of parameters par
% tData = temporal space (row vector)
% sData = matrix of delta series (each column is a different series that will be convolved with smoothFun)
% smoothFun = function used to convolve with each column of sData
%             must be of the form smoothFun(t, par)
% par = parameters to smoothing function
    if nargin < 5 || isempty(isParallel)
        isParallel = false;
    end
    if isvector(sData)
        [mm,nn] = size(sData);
        sData = sData(:);
    end
    res = zeros(size(sData));
    [ ~, n ] = size(sData);
    if ~isParallel
        %parfor i = 1:n % uncomment this and comment line below for strange error
        for i = 1:n
            res(:,i) = convolveSeries(tData, sData(:,i), smoothFun, par);
        end
    else
        myPool = gcp; % creates parallel pool if needed
        sched = parcluster; % creates scheduler
        job = createJob(sched);
        task = cell(1,n);
        for i = 1:n
            task{i} = createTask(job, @convolveSeries, 1, {tData, sData(:,i), smoothFun, par});
        end
        submit(job);
        wait(job);
        jobRes = fetchOutputs(job);
        for i = 1:n
            res(:,i) = jobRes{i,1}(:);
        end
        delete(job);
    end
    if isvector(sData)
        res = reshape(res, mm, nn);
    end
end

function r = convolveSeries(tData, s, smoothFun, par)
    r = zeros(size(s));
    tSpk = s == 1;
    j = 1;
    for t = tData
        for tt = tData(tSpk)
            if (tt > t)
                break;
            end
            r(j) = r(j) + smoothFun(t - tt, par);
        end
        j = j + 1;
    end
end

Concluding remarks

As a side note, I was not able to do it using parfor because MATLAB R2015a gave me a strange error:

Error using matlabpool (line 27)
matlabpool has been removed.

To query the size of an already started parallel pool, query the 'NumWorkers' property of the pool.

To check if a pool is already started use 'isempty(gcp('nocreate'))'.

Error in parallel_function (line 317)
Nworkers = matlabpool('size');

Error in convContinuous (line 18)
parfor i = 1:n

My version command outputs

Parallel Computing Toolbox                            Version 6.6        (R2015a)

which is compatible with my MATLAB version. Almost all other tests I have run are OK, so I am inclined to think that this is a MATLAB bug.

I tried replacing matlabpool with gcp and retrieving the number of workers from parPoolObj.NumWorkers. After making this change in two different built-in functions, I received another error:

Error in convContinuous>makeF%1/F% (line 1)
function res = convContinuous(tData, sData, smoothFun, par)

Output argument "res" (and maybe others) not assigned during call to "convContinuous>makeF%1/F%".

Error in parallel_function>iParFun (line 383)
output.data = processInfo.fun(input.base, input.limit, input.data);

Error in parProcess (line 167)
data = processFunc(processInfo, data);

Error in parallel_function (line 358)
stateInfo = parProcess(@iParFun, @iConsume, @iSupply, ...

Error in convContinuous (line 14)
parfor i = 1:numel(sData(1,:))

I suspect that this last error occurs because the function call inside the parfor loop takes many arguments, but I am not sure.

Solving the errors

Thanks to the cautious comments from people here (who said they could not reproduce my errors), I went looking for the source of the error. I realized it was a local problem caused by having pforfun, which I downloaded long ago from File Exchange, in my pathdef.m.

Once I removed pforfun from my pathdef.m, parfor (line 18 in convContinuous function) started working well.

Thank you in advance!

Upvotes: 0

Views: 1267

Answers (2)

Girardi

Reputation: 2811

Thanks to the cautious comments from people here (who said they could not reproduce my errors), I went looking for the source of the error. I realized it was a local problem caused by having pforfun, which I downloaded long ago from File Exchange, in my pathdef.m.

Once I removed pforfun from my pathdef.m, parfor (line 18 in convContinuous function) started working well.
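For anyone hitting the same symptom, a quick way to look for this kind of shadowing is sketched below. It rests on assumptions: judging by the stack trace, the outdated file was presumably a copy of parallel_function, and the folder name used here is purely hypothetical, so substitute the real location of the old download.

```matlab
% Sketch: list every file on the path that defines parallel_function.
% More than one hit suggests an old copy is shadowing the toolbox version.
hits = which('parallel_function', '-all');   % cell array of file locations
disp(hits);

% Hypothetical folder name; replace it with the real location of the download.
oldDir = fullfile('path', 'to', 'pforfun');
if ~isempty(strfind(path, oldDir))   % only act if the folder is on the path
    rmpath(oldDir);                  % remove it for the current session
    % savepath;                      % uncomment to make the change permanent in pathdef.m
end
```

After removing the folder, running which on the same name again should show only the toolbox copy.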

Upvotes: 0

Thomas Ibbotson

Reputation: 745

The parallel pool you created is blocking your job from running. When you use the jobs and tasks API, you do not need (and must not have) a pool open. The running job you saw in the Job Monitor was the job that backs the parallel pool, which only finishes when the pool is deleted.

If you delete the line in convContinuous that says myPool = gcp, then it should work. As an optimization, you can use the vectorised form of createTask, which is much more efficient than creating tasks in a loop, i.e.

inputCell = cell(1, n);
for i = 1:n
    inputCell{i} = {tData, sData(:,i), smoothFun, par};
end
task = createTask(job, @convolveSeries, 1, inputCell);
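Putting both suggestions together, the job-based branch could look roughly like the sketch below. It is an illustration under assumptions, not a tested drop-in: the function name convContinuousTasks is made up, convolveSeries is copied from the question, sData is assumed to be a matrix with one series per column, and closing any open pool with delete(gcp('nocreate')) is only one way to make sure the workers are free.

```matlab
function res = convContinuousTasks(tData, sData, smoothFun, par)
% Sketch only: job/task version without a parallel pool.
% The name convContinuousTasks is hypothetical.
    delete(gcp('nocreate'));     % close an open pool, if any, so the job is not left queued
    n = size(sData, 2);
    sched = parcluster;          % default cluster profile
    job = createJob(sched);
    inputCell = cell(1, n);      % one cell of input arguments per task
    for i = 1:n
        inputCell{i} = {tData, sData(:,i), smoothFun, par};
    end
    createTask(job, @convolveSeries, 1, inputCell);   % creates n tasks in one call
    submit(job);
    wait(job);
    jobRes = fetchOutputs(job);  % one row per task, one column per output
    delete(job);
    res = zeros(size(sData));
    for i = 1:n
        res(:,i) = jobRes{i,1}(:);
    end
end

function r = convolveSeries(tData, s, smoothFun, par)
% Copied from the question: sums smoothFun over all spikes up to time t.
    r = zeros(size(s));
    tSpk = s == 1;
    j = 1;
    for t = tData
        for tt = tData(tSpk)
            if tt > t
                break;
            end
            r(j) = r(j) + smoothFun(t - tt, par);
        end
        j = j + 1;
    end
end
```

Starting workers for a job has some overhead, so this route probably only pays off when sData has many columns.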

However, having said all that, you should be able to make this code work using parfor. The first error you encountered occurred because matlabpool has been removed; it has been replaced by parpool.
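A minimal parfor sketch of the same loop is shown below, assuming a clean MATLAB path and a row-vector tData. The function name convContinuousParfor is hypothetical, convolveSeries is copied from the question, and the explicit pool check is optional because parfor normally starts a pool on its own when the default preferences allow it.

```matlab
function res = convContinuousParfor(tData, sData, smoothFun, par)
% Sketch only: parfor version. The name convContinuousParfor is hypothetical.
    if isempty(gcp('nocreate'))  % no pool running yet
        parpool;                 % start one with the default profile
    end
    n = size(sData, 2);
    res = zeros(size(sData));
    parfor i = 1:n
        % res and sData are sliced by column, so iterations are independent
        res(:,i) = convolveSeries(tData, sData(:,i), smoothFun, par);
    end
end

function r = convolveSeries(tData, s, smoothFun, par)
% Copied from the question: sums smoothFun over all spikes up to time t.
    r = zeros(size(s));
    tSpk = s == 1;
    j = 1;
    for t = tData
        for tt = tData(tSpk)
            if tt > t
                break;
            end
            r(j) = r(j) + smoothFun(t - tt, par);
        end
        j = j + 1;
    end
end
```

Unlike the jobs and tasks API, this version needs the pool, so the two approaches should not be mixed in the same call.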

The second error appears to be caused by your function not returning the correct outputs, but the error message does not appear to correspond to the code you posted, so I'm not sure. Specifically I don't know what convContinuous>makeF%1/F% (line 1) refers to.

Upvotes: 3
