Reputation: 123
I have 4 directories (named 1, 2, 3, 4). Each one contains an executable named submit, compiled from C code. Requesting resources with #PBS -l select=2:ncpus=2 gave me 4 workers (2 on node-1 and 2 on node-2).
Task: I need to run the 4 executables independently, one in each of the 4 directories.
#PBS -l select=2:ncpus=2
./1/submit&
./2/submit&
./3/submit&
./4/submit&
The forking method above only uses node-1: all 4 jobs are shared between the 2 workers of node-1, and nothing ever runs on node-2.
#PBS -l select=2:ncpus=2
mpirun -np 1 -machinefile $PBS_NODEFILE ./1/submit&
mpirun -np 1 -machinefile $PBS_NODEFILE ./2/submit&
mpirun -np 1 -machinefile $PBS_NODEFILE ./3/submit&
mpirun -np 1 -machinefile $PBS_NODEFILE ./4/submit&
I also tried mpirun, but the jobs are still forked only between the node-1 workers. Is there any way to divide the jobs between the nodes?
Updates to the question after Ole Tange's answer
(1) The directory structure and its contents are as follows:
ParentDirectory contains the PBS file "sub.sh" and the sub-directories 1, 2, 3, 4. Each sub-directory has a submit file, which is an executable compiled with the icc compiler. submit is a molecular dynamics code that writes its output files into the directory from which it is started.
(2) Running the jobs on 1 node with 4 cores ==> 4 threads in total:
sub.sh has the contents:
#PBS -l select=1:ncpus=4
cd 1;./submit&
cd ../2;./submit&
cd ../3;./submit&
cd ../4;./submit&
sub.sh is submitted from the parent directory; it then enters each directory in turn and starts one submit process there. The resulting files are therefore generated inside each of the directories 1, 2, 3, 4 without any interference from the other directories or threads. The resulting video looks like this, which is correct.
(3) Running the jobs using GNU parallel on 2 nodes with 2 cores each ==> 4 threads in total:
sub.sh has the contents:
#PBS -l select=2:ncpus=2
seq 4 | parallel --wd . -S 2/"$node1" -S 2/"$node2" ./exx
exx has the contents:
cd 1;./submit&
cd ../2;./submit&
cd ../3;./submit&
cd ../4;./submit&
sub.sh is submitted from the parent directory. After submitting sub.sh, I saw that jobs are running in each of the folders 1, 2, 3, 4 and generating files inside the directories, and the speed is comparable to the serial code, which means that at least all 4 workers are working. But when I make the video from the results of one folder it looks strange: the blue swimmer oscillates a lot, which I suspect is caused by a race condition (video).
Something strange is surely going on between the threads, but I don't know what.
Upvotes: 2
Views: 369
Reputation: 33685
Something like:
seq 4 | parallel --wd . -S 2/node1 -S 2/node2 ./{}/submit
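Regarding update (3) in the question: `seq 4 | parallel ... ./exx` calls exx once per input line, and exx as shown starts all four simulations every time it is called. If that reading is right, each directory ends up with four submit processes writing the same files at once, which would fit the strange video, though that is only a guess. The sketch below reproduces the counting locally, without PBS, ssh or GNU parallel. The stub submit script, the runs.log file, the temporary directory and the `wait` added to exx are all made up for the illustration; a plain loop stands in for the four jobs parallel would start.

```shell
#!/bin/sh
# Local simulation only: a stub "submit" replaces the real MD executable.
set -eu
work=$(mktemp -d)
for d in 1 2 3 4; do
    mkdir "$work/$d"
    # Stub: append one line to runs.log in the directory it is started from.
    printf '#!/bin/sh\necho run >> runs.log\n' > "$work/$d/submit"
    chmod +x "$work/$d/submit"
done

# exx as in the question: every call starts all four simulations.
# ("wait" is added here only so that counting happens after they finish.)
cat > "$work/exx" <<'EOF'
#!/bin/sh
cd 1; ./submit &
cd ../2; ./submit &
cd ../3; ./submit &
cd ../4; ./submit &
wait
EOF
chmod +x "$work/exx"

# "seq 4 | parallel ./exx" calls exx once per input line, i.e. four times.
for i in 1 2 3 4; do
    (cd "$work" && ./exx "$i")
done
exx_runs=$(wc -l < "$work/1/runs.log" | tr -d ' ')
echo "exx style: directory 1 was run $exx_runs times"

# One job per directory instead: each job enters its own directory first.
for d in 1 2 3 4; do rm -f "$work/$d/runs.log"; done
for d in 1 2 3 4; do
    (cd "$work/$d" && ./submit) &
done
wait
fixed_runs=$(wc -l < "$work/1/runs.log" | tr -d ' ')
echo "per-directory style: directory 1 was run $fixed_runs times"
rm -rf "$work"
```

On the cluster, the per-directory variant would presumably look like `seq 4 | parallel --wd . -S 2/node1 -S 2/node2 'cd {} && ./submit'` (untested here; node1 and node2 are placeholders for the real host names). The `cd {}` matters because, according to the question, submit writes its files into the directory it is started from, so `./{}/submit` would probably write everything into the parent directory.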
Upvotes: 1