I am trying to profile an MPI+OpenACC program with nsys. I am using OpenMPI (3.1.6) from the NVIDIA HPC SDK (20.7) with UCX enabled. There are three executables: exec1, exec2, and exec3. I want to profile only exec3, but I cannot get it to work. Here is the run script:
#SBATCH --nodes=1
#SBATCH --ntasks=40
#SBATCH --ntasks-per-node=40
#SBATCH --output=app.out
#SBATCH --error=app.err
#SBATCH -p Intel_6248_2s_20c_2t_GPU_hdr100_192GB_2933
#SBATCH --exclusive
#SBATCH --gres=gpu:4
WRAPPER=/run/acc_round_robin.sh
exec1=$workdir/exec/prog1
exec2=$workdir/exec/prog2
exec3=$workdir/exec/prog3
echo "0 $WRAPPER $exec1" > $workdir/file.conf
echo "2-9,11-19,21-29,32-39 $WRAPPER $exec2">> $workdir/file.conf
echo "nsys profile 1,10,20,30,31 $WRAPPER $exec3">> $workdir/file.conf
echo "#!/bin/bash" > $workdir/file1_cmd
echo "srun --multi-prog $workdir/file.conf" >> $workdir/file1_cmd
echo "exit 1" >> $workdir/file1_cmd
chmod +x $workdir/file1_cmd
/usr/bin/time ./CASTING cast ./configure
date
TEND=$(echo "print time();" | perl)
echo "++++ Total elapsed time $(expr $TEND - $TBEGIN) seconds"
Run:- sbatch run.sh
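One thing that may matter here: according to the srun man page, each line of a `--multi-prog` configuration file starts with the task ranks, followed by the executable and its arguments. In the script above the exec3 line starts with `nsys profile` instead of the rank list. The following is a minimal sketch of the config file with the rank list first and nsys launched as the program for those ranks. It is untested on this cluster; the temporary `workdir` fallback and the report name `prog3_rank%t` are assumptions made for illustration, and `%t` is the srun multi-prog placeholder for the task number.

```shell
#!/bin/bash
# Sketch only: workdir falls back to a temporary directory so the snippet runs standalone.
workdir=${workdir:-$(mktemp -d)}
WRAPPER=/run/acc_round_robin.sh
exec1=$workdir/exec/prog1
exec2=$workdir/exec/prog2
exec3=$workdir/exec/prog3

# srun --multi-prog line format: <task ranks> <executable> <arguments...>
{
  echo "0 $WRAPPER $exec1"
  echo "2-9,11-19,21-29,32-39 $WRAPPER $exec2"
  # Rank list first; nsys wraps the wrapper and exec3 for these ranks only.
  # %t is expanded by srun to the task number, giving one report per rank
  # (the report name itself is a hypothetical choice).
  echo "1,10,20,30,31 nsys profile -o $workdir/prog3_rank%t $WRAPPER $exec3"
} > "$workdir/file.conf"

cat "$workdir/file.conf"
```

With this layout, `srun --multi-prog $workdir/file.conf` would start nsys only on ranks 1, 10, 20, 30 and 31, assuming nsys is on the PATH of the compute node.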