Reputation: 85
I'm running the following two PBS files at the same time:
qsub /mnt/folder/prueba1_1
qsub /mnt/folder/prueba01
And here are the files:
prueba1_1
#!/bin/bash
#PBS -N pruebaF
#PBS -V
#PBS -l nodes=1:ppn=1
#PBS -q batch
#PBS -j eo
cd /mnt/folder
mpiexec -f machinefile ./cpi2>>salida1_1.o
prueba01
#!/bin/bash
#PBS -N pruebaF
#PBS -V
#PBS -l nodes=1:ppn=1
#PBS -q batch
#PBS -j eo
cd /mnt/folder
mpiexec -f machinefile ./cpi2>>salida01.o
The file machinefile lists two nodes, slave02 and slave03, each with 1 processor.
Although I specify that each PBS file should use just 1 node and 1 processor per job (with #PBS -l nodes=1:ppn=1), the output files seem to show that each job is using both nodes at the same time. I'm wondering why, since each of these PBS files should use just one node and 1 processor. I expected prueba1_1 to use slave02 with 1 processor, and prueba01 to use slave02 as well but with the other processor.
The output files are here:
salida1_1.o
Process 0 of 2 is on slave02
Process 1 of 2 is on slave03
pi is approximately 3.1415926535900915, Error is 0.0000000000002984
wall clock time = 14.937282
salida01.o
Process 0 of 2 is on slave02
Process 1 of 2 is on slave03
pi is approximately 3.1415926535900915, Error is 0.0000000000002984
wall clock time = 14.741892
Upvotes: 0
Views: 173
Reputation: 745
I would change machinefile to $PBS_NODEFILE in the mpiexec line. When Torque/PBS assigns nodes to your job, it creates a file listing those nodes and stores the path to that file in the environment variable PBS_NODEFILE. I'm guessing machinefile was created by hand for testing. Since Torque does not create or update it, mpiexec always launches on the hosts listed there (slave02 and slave03), regardless of what #PBS -l nodes=1:ppn=1 requested, and that is why your jobs always run the same way.
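As a rough sketch of what I mean, prueba1_1 could look something like the script below. The header and the mpiexec flags are copied from your script. The list_allocation helper and the check that PBS_NODEFILE is set are my own additions for illustration (the helper name is hypothetical), so you can see in the output which hosts Torque actually granted. I have not run this on your cluster.

```shell
#!/bin/bash
#PBS -N pruebaF
#PBS -V
#PBS -l nodes=1:ppn=1
#PBS -q batch
#PBS -j eo

# Hypothetical helper: print each allocated host with its slot count.
# Torque writes one line per granted slot into the node file.
list_allocation() {
    sort "$1" | uniq -c
}

# Only launch when the scheduler has set PBS_NODEFILE (i.e. inside a job).
if [ -n "${PBS_NODEFILE:-}" ]; then
    cd /mnt/folder || exit 1
    # Record what Torque assigned, then start MPI on exactly those hosts.
    list_allocation "$PBS_NODEFILE" >> salida1_1.o
    mpiexec -f "$PBS_NODEFILE" ./cpi2 >> salida1_1.o
fi
```

With nodes=1:ppn=1 the node file should contain a single line, so the output should report a single process on a single host. prueba01 would need the same change, writing to salida01.o instead.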
Upvotes: 1