zwlayer

Reputation: 1824

Check GPU memory periodically and run a script when it becomes free

I have 4 Nvidia GPUs in my system. I want to periodically check whether a specific GPU is free (e.g. whether its free memory is more than 10 GB) and, once it is, run a Python script.

I think I can use nvidia-smi to check how much free memory a given GPU has. I have an idea, but I couldn't finish the script. Can anybody help me?

Here is what I have written so far:

check.sh

id=$1
free_mem=$(nvidia-smi --query-gpu=memory.free --format=csv -i $id)
echo $free_mem # this prints out: memory.free [MiB] 1954 MiB
while [ $free_mem -lt 10000 ]
    free_mem=$(nvidia-smi --query-gpu=memory.free --format=csv -i $id)
    sleep 5

CUDA_VISIBLE_DEVICES=$id python run_python_file.py

I believe the code should look something like the snippet above, but I couldn't work out the details.

Upvotes: 4

Views: 2741

Answers (1)

builder-7000

Reputation: 7627

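Use grep -Eo '[0-9]+' to extract the number (one or more consecutive digits) from the nvidia-smi output:

id=$1
# The CSV output is a "memory.free [MiB]" header followed by e.g. "1954 MiB"; keep only the digits
free_mem=$(nvidia-smi --query-gpu=memory.free --format=csv -i "$id" | grep -Eo '[0-9]+')

# Poll every 5 seconds until at least 10000 MiB are free
while [ "$free_mem" -lt 10000 ]; do
    free_mem=$(nvidia-smi --query-gpu=memory.free --format=csv -i "$id" | grep -Eo '[0-9]+')
    sleep 5
done

# Run the script on the GPU that just became free
CUDA_VISIBLE_DEVICES=$id python run_python_file.py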

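An alternative to grep is sed with a POSIX character class: sed -n 's/[^[:digit:]]*\([[:digit:]]\{1,\}\).*/\1/p'. The -n option together with the p flag prints only the lines where the substitution matched, which skips the header line, and \{1,\} is the portable spelling of the GNU-only \+.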
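It may also be possible to skip the text parsing entirely: nvidia-smi accepts noheader and nounits as extra --format options, so the query prints a bare integer. The following is only a sketch under that assumption. The function names gpu_free_mib and wait_for_gpu are made up for illustration, and the defaults (10000 MiB, 5 seconds) are taken from the question.

```shell
# Sketch: wait until a GPU has enough free memory.
# gpu_free_mib and wait_for_gpu are hypothetical helper names, not part of nvidia-smi.

# Print the free memory of GPU $1 in MiB as a bare integer.
# noheader drops the "memory.free [MiB]" line, nounits drops the " MiB" suffix.
gpu_free_mib() {
    nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits -i "$1"
}

# Block until GPU $1 reports at least $2 MiB free (default 10000),
# polling every $3 seconds (default 5).
wait_for_gpu() {
    until [ "$(gpu_free_mib "$1")" -ge "${2:-10000}" ]; do
        sleep "${3:-5}"
    done
}

# Example call, left commented out so the file can be sourced without side effects:
# wait_for_gpu "$1" 10000 5 && CUDA_VISIBLE_DEVICES=$1 python run_python_file.py
```

Because the threshold is tested before the first sleep, the script starts immediately if the GPU is already free. Note that if nvidia-smi fails and prints nothing, the comparison errors out and the loop keeps polling, so a real script would probably want an explicit check for that case.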

Upvotes: 6
