RyanStochastic

Reputation: 4083

Can't find function on unix cluster: matlab parallel: undefined function

I am trying to run a parallel job on a large cluster. All code and functions work fine on my computer (Mac OS 10.7, MATLAB v7.13.0.564 (R2011b)), but something is not working on the cluster (Unix, Red Hat 5.5, kernel 2.6.18-238.12.1.el5, MATLAB v7.13.0.564 (R2011b)).

The sequence of commands below runs without error on my machine and fails on the cluster. The function add2nums is in the subdirectory ./lib and is shown here:

function out = add2nums(num1, num2)
    out = num1+num2;
end

What is going on and how do I fix it?

>> addpath('./lib')

>> which add2nums

/My_Matlab_Path/lib/add2nums.m (MATLAB sees the function both on my machine and on the Unix cluster)

>> sched = findResource('scheduler','type','local');

>> job = createParallelJob(sched,'configuration','local');

>> task = createTask(job,@add2nums,1,{[1 2],[3 4]},'CaptureCommandWindowOutput',true);   

>> addpath('./lib')

>> submit(job)

>> waitForState(job)

>> task

task =

Task ID 1 from Job ID 19 Information
====================================

                     State : finished
                  Function : @add2nums
                 StartTime : Tue Aug 07 10:27:44 MDT 2012
          Running Duration : 0 days 0h 0m 1s

- Task Result Properties

           ErrorIdentifier : MATLAB:UndefinedFunction
              ErrorMessage : Undefined function 'add2nums' for input arguments of type 'double'.

Upvotes: 1

Views: 756

Answers (2)

RyanStochastic

Reputation: 4083

Found a solution; in this case it may be specific to the unix system architecture I'm working on. I submit jobs from one filesystem, and they run in a separate (temporary) directory - on a separate filesystem. I think that these are in fact different pieces of hardware in different buildings, but I'm not positive.

The resolution is to add the line:

p = 'PATH_TO_LOCAL_DIRECTORY_OF_FUNCTIONS';

set(job,'FileDependencies',{p});

I had tried setting the FileDependencies property before, but I had used a path that, for whatever reason, the MATLAB worker couldn't use. The solution for me was to copy my entire directory of subfunctions to the remote (temporary) directory where the Unix job runs. That remote directory is then local to, and visible to, the MATLAB workers, as long as the 'FileDependencies' property points to it.
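For reference, here is a rough sketch of where that line might sit in the sequence of commands from the question. It assumes the R2011b job API used above and that add2nums.m lives in a lib folder under the current directory; the fullfile(pwd, 'lib') path is only a stand-in for whatever directory the workers can actually use on a given cluster.

```matlab
% Sketch only, using the R2011b Parallel Computing Toolbox job API.
sched = findResource('scheduler', 'type', 'local');
job   = createParallelJob(sched, 'configuration', 'local');

% Assumption: the helper functions sit in ./lib relative to the current
% directory. Replace this with a path that works on your own cluster.
p = fullfile(pwd, 'lib');

% Set the dependency before submitting the job.
set(job, 'FileDependencies', {p});

task = createTask(job, @add2nums, 1, {[1 2], [3 4]}, ...
    'CaptureCommandWindowOutput', true);

submit(job);
waitForState(job);                      % blocks until the job has finished

errMsg  = get(task, 'ErrorMessage');    % empty if the task ran cleanly
results = getAllOutputArguments(job);   % cell array, one row per task
```

If errMsg still reports MATLAB:UndefinedFunction, the path in p is probably not one the workers can read, which matches what I ran into on my first attempt.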

Upvotes: 1

Amro

Reputation: 124563

Reading the documentation, it seems you need to set up the file and path dependencies.

See the FileDependencies and PathDependencies properties of the parallel job object.
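As a hedged illustration of the difference between the two properties in R2011b: FileDependencies lists files or folders that are copied to the workers, while PathDependencies lists directories that are added to the workers' MATLAB path and therefore must already be reachable from the worker machines. The paths below are assumptions for illustration, not paths taken from the question.

```matlab
% Sketch only, assuming the R2011b job API and a ./lib folder of helpers.
sched = findResource('scheduler', 'type', 'local');
job   = createParallelJob(sched, 'configuration', 'local');

libDir = fullfile(pwd, 'lib');   % assumed location of add2nums.m

% Option 1: copy the folder to each worker along with the job.
set(job, 'FileDependencies', {libDir});

% Option 2 (alternative): if the workers can see the same filesystem,
% add the folder to their path instead of copying it.
% set(job, 'PathDependencies', {libDir});

deps = get(job, 'FileDependencies');   % cell array of the paths just set
```

Which option fits depends on whether the workers share a filesystem with the machine that submits the job.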

Upvotes: 0
