Zhao

Reputation: 2193

Show GPU memory usage and utilization for a slurm job

I am using slurm to access GPU resources. Is it possible to show GPU usage for a running slurm job? Just like using nvidia-smi in a normal interactive shell.

Upvotes: 6

Views: 13059

Answers (3)

Zhang Kin

Reputation: 89

I think most users don't have permission to SSH directly into a compute node (i.e. ssh node-1).

Here is how to do it within Slurm:

  1. Check your job ID:
squeue -u <your_username>

and you will get the job ID for your running job.

  2. Run nvidia-smi within that job's allocation using srun (a combined sketch follows below):
srun --jobid=123456 nvidia-smi
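
Putting both steps together, a minimal sketch (123456 is a placeholder job ID; the --overlap flag may be needed on newer Slurm releases so this step can share the job's GPUs, and watch must be available on the compute node):

# Find the ID of your running job (JOBID column):
squeue -u $USER

# One-shot snapshot of GPU usage inside the job's allocation;
# drop --overlap if your srun does not accept it:
srun --jobid=123456 --overlap nvidia-smi

# Refresh every 2 seconds (Ctrl-C to stop):
srun --jobid=123456 --overlap --pty watch -n 2 nvidia-smi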

Upvotes: 5

dtlam26

Reputation: 1600

I suggest launching your application manually in Jupyter and opening a terminal shell in Jupyter, where you can run nvidia-smi.

Upvotes: 0

masterchief

Reputation: 51

You can use ssh to log in to the node your job is running on, then use nvidia-smi; it works for me. For example, I use squeue to check that my job xxxxxx is currently running on node x-x-x, then I use ssh x-x-x to access that node. After that, you can run nvidia-smi to check GPU usage.
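
A minimal sketch of this approach, assuming your cluster allows SSH to compute nodes; xxxxxx and x-x-x are placeholders for your job ID and node name, as in the answer:

# Find which node job xxxxxx is running on (%i = job ID, %N = node list):
squeue -j xxxxxx -o "%i %N"

# Log in to that node (only works if the site permits SSH to compute nodes):
ssh x-x-x

# On the node, show GPU memory usage and utilization:
nvidia-smi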

Upvotes: 5
