jiyin

Reputation: 85

Torque does not limit the number of nodes mpiexec uses

I'm submitting the following two PBS files at the same time:

qsub /mnt/folder/prueba1_1
qsub /mnt/folder/prueba01

And here are the files:

prueba1_1

#!/bin/bash
#PBS -N pruebaF
#PBS -V
#PBS -l nodes=1:ppn=1
#PBS -q batch
#PBS -j eo
cd /mnt/folder
mpiexec -f machinefile  ./cpi2>>salida1_1.o

prueba01

#!/bin/bash
#PBS -N pruebaF
#PBS -V
#PBS -l nodes=1:ppn=1
#PBS -q batch
#PBS -j eo
cd /mnt/folder
mpiexec -f machinefile  ./cpi2>>salida01.o

The file machinefile lists 2 nodes, slave02 and slave03, each with 1 processor.

Although I specify that each job should use just 1 node and 1 processor (with #PBS -l nodes=1:ppn=1), the output files seem to show that each job is using both nodes at the same time. I'm wondering why, since each PBS file requests only one node and one processor. I expected prueba1_1 to use slave02 with 1 processor, and prueba01 to use slave02 as well but with the other processor.

The output files are here:

salida1_1.o

Process 0 of 2 is on slave02
Process 1 of 2 is on slave03
pi is approximately 3.1415926535900915, Error is 0.0000000000002984
wall clock time = 14.937282

salida01.o

Process 0 of 2 is on slave02
Process 1 of 2 is on slave03
pi is approximately 3.1415926535900915, Error is 0.0000000000002984
wall clock time = 14.741892

Upvotes: 0


Answers (1)

chuck

Reputation: 745

I would change machinefile to $PBS_NODEFILE. When Torque/PBS assigns nodes to your job, it creates a file listing those nodes and stores the path to that file in the PBS_NODEFILE environment variable. I'm guessing machinefile was created by hand for testing. Since Torque neither creates nor updates it, mpiexec launches on both slave02 and slave03 every time, regardless of what the job requested.
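As a rough sketch of what that change could look like for prueba1_1: the script below assumes the same MPICH-style mpiexec (with its -f host file option) used in the question. The slot_count helper is a hypothetical name introduced here; it just counts the lines in the node file so that -n matches what Torque allocated.

```shell
#!/bin/bash
#PBS -N pruebaF
#PBS -V
#PBS -l nodes=1:ppn=1
#PBS -q batch
#PBS -j eo

# Hypothetical helper: print the number of lines in a node file.
# Torque writes one line per allocated processor slot.
slot_count() {
    # The arithmetic expansion strips any padding that wc may print.
    echo $(( $(wc -l < "$1") ))
}

if [ -n "${PBS_NODEFILE:-}" ]; then
    # Inside a Torque job: use only the nodes Torque assigned to this job.
    cd /mnt/folder || exit 1
    mpiexec -f "$PBS_NODEFILE" -n "$(slot_count "$PBS_NODEFILE")" ./cpi2 >> salida1_1.o
else
    # Outside a job the variable is unset, so there is nothing to run on.
    echo "PBS_NODEFILE is not set; submit this script with qsub" >&2
fi
```

With nodes=1:ppn=1 the node file should contain a single line, so each job would start one process on the one node it was given. Adding cat "$PBS_NODEFILE" before the mpiexec line is a quick way to check what Torque actually assigned.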

Upvotes: 1
