Reputation: 141
I've run into a strange problem: this setup used to work, but now it doesn't.
I'm running an Open MPI program with TAU profiling across 2 computers. It seems that mpirun can't launch tau_exec on the remote host. Maybe it's a permission issue?
cluster@master:~/software/mpi_in_30_source/test2$ mpirun -np 2 --hostfile hostfile -d tau_exec -v -T MPI,TRACE,PROFILE ./hello.exe
[master:19319] procdir: /tmp/openmpi-sessions-cluster@master_0/4568/0/0
[master:19319] jobdir: /tmp/openmpi-sessions-cluster@master_0/4568/0
[master:19319] top: openmpi-sessions-cluster@master_0
[master:19319] tmp: /tmp
[slave2:06777] procdir: /tmp/openmpi-sessions-cluster@slave2_0/4568/0/1
[slave2:06777] jobdir: /tmp/openmpi-sessions-cluster@slave2_0/4568/0
[slave2:06777] top: openmpi-sessions-cluster@slave2_0
[slave2:06777] tmp: /tmp
[master:19319] [[4568,0],0] node[0].name master daemon 0 arch ff000200
[master:19319] [[4568,0],0] node[1].name slave2 daemon 1 arch ff000200
[slave2:06777] [[4568,0],1] node[0].name master daemon 0 arch ff000200
[slave2:06777] [[4568,0],1] node[1].name slave2 daemon 1 arch ff000200
[master:19319] Info: Setting up debugger process table for applications
MPIR_being_debugged = 0
MPIR_debug_state = 1
MPIR_partial_attach_ok = 1
MPIR_i_am_starter = 0
MPIR_proctable_size = 2
MPIR_proctable:
(i, host, exe, pid) = (0, master, /home/cluster/software/mpi_in_30_source/test2/tau_exec, 19321)
(i, host, exe, pid) = (1, slave2, /home/cluster/software/mpi_in_30_source/test2/tau_exec, 0)
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:
Executable: tau_exec
Node: slave2
while attempting to start process rank 1.
--------------------------------------------------------------------------
[slave2:06777] sess_dir_finalize: job session dir not empty - leaving
[slave2:06777] sess_dir_finalize: job session dir not empty - leaving
[master:19319] sess_dir_finalize: job session dir not empty - leaving
[master:19319] sess_dir_finalize: proc session dir not empty - leaving
orterun: exiting with status -123
On slave2:
cluster@slave2:~/software/mpi_in_30_source/test2$ tau_exec -T MPI,TRACE,PROFILE ./hello.exe
hello MPI user: from process = 0 on machine=slave2, of NCPU=1 processes
cluster@slave2:~/software/mpi_in_30_source/test2$ which tau_exec
/home/cluster/tools/tau-2.22.2/arm_linux/bin/tau_exec
So there is a working tau_exec on both nodes. When I run mpirun without tau_exec everything works.
cluster@master:~/software/mpi_in_30_source/test2$ mpirun -np 2 --hostfile hostfile ./hello.exe
hello MPI user: from process = 0 on machine=master, of NCPU=2 processes
hello MPI user: from process = 1 on machine=slave2, of NCPU=2 processes
Upvotes: 2
Views: 38683
Reputation: 4127
If you're running a shell script with mpirun, make sure it is executable (chmod +x script_file.sh); otherwise you'll see this error.
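A minimal sketch of that check, assuming a POSIX shell. ensure_executable is a hypothetical helper name, and script_file.sh is just the placeholder used above. It only fixes the local copy: on a cluster without a shared filesystem, the script also has to be executable on every node.

```shell
# Hypothetical helper: make sure a script exists and has its executable bit set.
ensure_executable() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    # Add the executable bit only when it is missing.
    [ -x "$1" ] || chmod +x "$1"
}
```

You could then guard the launch with something like: ensure_executable ./script_file.sh && mpirun -np 2 --hostfile hostfile ./script_file.sh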
Upvotes: 2
Reputation: 159
Maybe it's because you have Open MPI installed as well as MPICH2. If so, run the command below as root:
root~# update-alternatives --config mpirun
There are 2 choices for the alternative mpirun (providing /usr/bin/mpirun).
Selection | Path | Priority | Status
Press enter to keep the current choice[*], or type selection number: 1
Then you should select the MPICH version, as above, to run normally.
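Before switching, it can help to see which file the mpirun on your PATH really points to. A small sketch, assuming a Linux system where readlink supports -f (GNU coreutils); real_cmd_path is a hypothetical helper name, not part of any MPI distribution.

```shell
# Hypothetical helper: print the fully resolved path of a command on PATH,
# following any symlinks (for example, ones managed by update-alternatives).
real_cmd_path() {
    p=$(command -v "$1") || return 1
    readlink -f "$p"
}
```

Running real_cmd_path mpirun prints the file that is actually executed, and on systems that manage mpirun through alternatives the resolved name usually makes the implementation obvious.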
Upvotes: 2
Reputation: 153
I once had an error like this when I tried to give the output file a custom name. Try leaving it at the default name:
mpirun -n <number> a.out
That is how it worked for me!
Upvotes: 2
Reputation: 9062
Try putting the full path to tau_exec in your command line. It's possible that your PATH isn't the same on all of the nodes. If that's the case, mpirun won't be able to find the executable on any node where the PATH doesn't include it.
It's most likely not a permission issue, but I don't remember all of the error messages in Open MPI to tell you how helpful they might be.
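A rough sketch of the full-path approach, assuming a POSIX shell and that tau_exec is installed at the same absolute path on every node. build_mpirun_cmd is a hypothetical helper that only prints the command line so you can inspect it before running it; the -np 2 and --hostfile hostfile arguments are copied from the question.

```shell
# Hypothetical helper: resolve the launcher (e.g. tau_exec) to an absolute
# path using the local PATH, then print an mpirun command line that uses it.
build_mpirun_cmd() {
    launcher=$(command -v "$1") || { echo "not found on PATH: $1" >&2; return 1; }
    shift
    printf '%s\n' "mpirun -np 2 --hostfile hostfile $launcher $*"
}

# To compare with what a remote, non-interactive shell sees (this can differ
# from an interactive login on the same machine), something like:
#   ssh slave2 'echo $PATH; command -v tau_exec'
```

For example, build_mpirun_cmd tau_exec -T MPI,TRACE,PROFILE ./hello.exe prints the command with tau_exec expanded to wherever it lives on the local node. If the ssh check prints no path for tau_exec, the remote PATH is the likely culprit.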
Upvotes: 2