Post169

Reputation: 708

Matlab - batch jobs won't leave queued status

I've got some code where the amount of work to be processed grows by some percentage on each iteration of a loop. The first few iterations take 4 seconds each, but by the 100th they're taking minutes - and this is with a light selection of parameters, as I intend to do 350 iterations. Doing serious research with this would take an enormous amount of time, and it's really inconvenient that simply running a script ties Matlab's hands behind its back until it's all done; on top of that, it hardly ever uses more than one core at a time.

I understand that turning on a parallel pool enables parallel processing. Even if I can't convert any of the for loops into parfor loops, running a script as a batch job should send that process into the background, letting me do other things with the Matlab interface and the other 7 cores while I wait for this one to finish.

However, though I have the local parallel pool up and running, and I've checked the syntax for starting a batch job, the job never leaves the "queued" status. The first time, I typed batch('Script4') and hit Enter, then realized I needed to assign the job to a variable, so I did run1 = batch('Script4'). I typed get(run1,'State'), and also checked the Job Monitor; both told me that its state was "queued".
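
For reference, the full sequence of commands was roughly:

batch('Script4')
run1 = batch('Script4');
get(run1, 'State')   % reported 'queued'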

I did some googling before coming here, and while I found some Q&As describing similar experiences, they seemed to be resolved by things like waiting for the pool to finish hogging the CPU as it starts up. But I started my pool a long time ago (and it is still running at this moment!), and when I entered the first batch command, my first clue that something was wrong was that Windows Task Manager showed all 8 cores at 0%.

Is there something I need to call or maybe adjust before it will start executing the queued jobs?

I'm using Matlab R2015a on Windows 7 Enterprise.

Upvotes: 0

Views: 1471

Answers (1)

Edric

Reputation: 25140

I think the problem here is that you're trying to run batch jobs while the parallel pool is open. (Unfortunately, this is a common misunderstanding.) Basically, the parallel pool and your batch job are both trying to consume local workers; because you opened the parallel pool first, it's consuming all the local workers, and the batch job cannot proceed. You should have seen a warning when you submitted the batch job, like this:

>> parpool('local');
Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.
>> j = batch(@rand, 1, {});
Warning: This job will remain queued until the Parallel Pool is closed. 

There are two possible fixes. The first is simple:

delete(gcp('nocreate'))

This ensures no parallel pool is open, and your batch submissions should proceed. The second is more appropriate if your tasks are relatively short-lived: you can use parfeval to submit work to an open parallel pool:

f = parfeval(@rand, 1); % request one output from 'rand' on a parallel pool worker
fetchOutputs(f); % block until completion, and retrieve the result
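
Since your loop has hundreds of iterations, one way to use the second approach is to submit each iteration as its own parfeval request and collect the results as they finish with fetchNext. The following is only a sketch: processOne is a placeholder for the body of your loop, and the future-array preallocation idiom follows the Parallel Computing Toolbox documentation.

N = 350;                                 % number of iterations, per the question
f(1:N) = parallel.FevalFuture;           % preallocate an array of futures
for k = 1:N
    f(k) = parfeval(@processOne, 1, k);  % processOne stands in for your per-iteration work
end
results = cell(1, N);
for k = 1:N
    [idx, out] = fetchNext(f);           % blocks until the next unread future completes
    results{idx} = out;
end

This keeps the pool workers busy while leaving the Matlab desktop free for other work.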

Upvotes: 1
