a_parida

Reputation: 626

reset memory usage of a single GPU

I have access to 4 GPUs (not as root). One of them, GPU 2, behaves strangely: some of its memory is in use, but its power consumption and temperature are very low, as if nothing is running on it. See the details from nvidia-smi in the image below: (image: nvidia-smi output)

How can I reset GPU 2 without disturbing the processes running on the other GPUs?

PS: I am not a root user, but I think I can get help from someone with root access.

Upvotes: 6

Views: 11249

Answers (1)

Arash Mohammadi

Reputation: 266

Resetting the GPU may resolve your problem, although it may not be possible depending on your GPU configuration. Note that the reset requires root privileges:

nvidia-smi --gpu-reset -i "gpu ID"

For example, if NVLink is enabled between the GPUs, the reset does not always go through. It also seems that nvidia-smi in your case is unable to find the process running on your GPU. The solution in your case is to find and kill the process associated with that GPU by running the following commands, replacing PID with the one reported by fuser:

fuser -v /dev/nvidia*

kill -9 "PID"
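
If fuser is not installed, or you want to look only at GPU 2 instead of every /dev/nvidia* node, the same lookup can be sketched with /proc alone. This is a minimal sketch, not a definitive tool: gpu_holders is a hypothetical helper name, it is Linux-only, and it assumes GPU 2 maps to /dev/nvidia2 (check the "Minor Number" field in nvidia-smi -q -i 2). Without root you will only see your own processes.

```shell
# List PIDs of processes that hold the given device file open.
# Similar in spirit to `fuser <path>`, but needs only /proc (Linux).
# gpu_holders is a hypothetical helper name, not part of any NVIDIA tool.
gpu_holders() {
    target=$1
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for processes we may not inspect; skip those quietly
        if [ "$(readlink "$fd" 2>/dev/null)" = "$target" ]; then
            pid=${fd#/proc/}      # strip the leading "/proc/"
            echo "${pid%%/*}"     # keep only the PID component
        fi
    done | sort -un
}

# Example usage (assumes GPU 2 is /dev/nvidia2):
# for pid in $(gpu_holders /dev/nvidia2); do
#     ps -o pid,user,etime,cmd -p "$pid"   # inspect before killing
#     kill "$pid"                          # SIGTERM first; kill -9 only if it stays
# done
```

Trying a plain kill before kill -9 gives the process a chance to release the GPU cleanly; the memory should be freed once the last process holding the device exits.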

Upvotes: 6
