mrgloom

Reputation: 21612

Run several Python scripts in parallel with GNU parallel or xargs

I have run_command_list.txt, which contains one command per line:

time python3 train.py --dataroot ./datasets/maps --name maps_pix2pix --model pix2pix --direction AtoB --checkpoints_dir maps_pix2pix_a_to_b_bs_1 --batch_size 1 > bs_1.log
time python3 train.py --dataroot ./datasets/maps --name maps_pix2pix --model pix2pix --direction AtoB --checkpoints_dir maps_pix2pix_a_to_b_bs_2 --batch_size 2 > bs_2.log
time python3 train.py --dataroot ./datasets/maps --name maps_pix2pix --model pix2pix --direction AtoB --checkpoints_dir maps_pix2pix_a_to_b_bs_4 --batch_size 4 > bs_4.log
...

I want to run no more than 2 jobs in parallel, and I want to set CUDA_VISIBLE_DEVICES=0 or CUDA_VISIBLE_DEVICES=1 depending on which GPU is free at that moment. How can I do this using parallel or xargs?

For example, something like cat run_command_list.txt | xargs -n 1 -P 2

Upvotes: 1

Views: 262

Answers (2)

Ole Tange

Reputation: 33685

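# {=1 $_=slot()-1 =} evaluates to the job slot number minus 1 (0 or 1 with -j2),
# so the two concurrent jobs never share a GPU. seq 1000 supplies the batch sizes.
seq 1000 |
  parallel -j2 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =}' \
    time python3 train.py --dataroot ./datasets/maps --name maps_pix2pix \
    --model pix2pix --direction AtoB \
    --checkpoints_dir maps_pix2pix_a_to_b_bs_{} --batch_size {} '>' bs_{}.log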
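
Since the question also mentions xargs, here is a minimal sketch of the same slot-to-GPU idea using it. It assumes GNU xargs (findutils), whose --process-slot-var option sets an environment variable to a per-slot value that is reused once a child exits; as far as I know the values start at 0, so with -P 2 they line up with GPUs 0 and 1. The helper name run_on_two_gpus is hypothetical, and each line is handed to bash -c so that the redirection and the time keyword in the command list are interpreted by a shell.

```shell
# Sketch, assuming GNU xargs: run every line of a command file, at most
# two at a time, with CUDA_VISIBLE_DEVICES set to the xargs slot number.
# run_on_two_gpus is a hypothetical helper name, not part of any tool.
run_on_two_gpus() {
    # $1: file with one shell command per line (e.g. run_command_list.txt)
    # -d '\n'  treat each input line as a single argument (GNU extension)
    # -n 1     pass one line per invocation
    # -P 2     keep at most two jobs running
    xargs -d '\n' -n 1 -P 2 \
        --process-slot-var=CUDA_VISIBLE_DEVICES \
        bash -c < "$1"
}
```

Usage would be run_on_two_gpus run_command_list.txt. A job only starts when a slot is free, so the slot number doubles as the index of the idle GPU, provided nothing else is using those GPUs.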

Upvotes: 1

Matias Barrios

Reputation: 5056

You can do this:

function GET_AVAILABLE_DEVICE() {
    [[ SOMETHING_HERE == SOMETHING ]] && echo 0 || echo 1
}

CUDA_VISIBLE_DEVICES=$( GET_AVAILABLE_DEVICE ) time python3 train.py --dataroot ./datasets/maps --name maps_pix2pix --model pix2pix --direction AtoB --checkpoints_dir maps_pix2pix_a_to_b_bs_1 --batch_size 1 > bs_1.log &
CUDA_VISIBLE_DEVICES=$( GET_AVAILABLE_DEVICE ) time python3 train.py --dataroot ./datasets/maps --name maps_pix2pix --model pix2pix --direction AtoB --checkpoints_dir maps_pix2pix_a_to_b_bs_2 --batch_size 2 > bs_2.log &
CUDA_VISIBLE_DEVICES=$( GET_AVAILABLE_DEVICE ) time python3 train.py --dataroot ./datasets/maps --name maps_pix2pix --model pix2pix --direction AtoB --checkpoints_dir maps_pix2pix_a_to_b_bs_4 --batch_size 4 > bs_4.log &
wait

Replace SOMETHING_HERE == SOMETHING with whatever test tells you which device is currently available.
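
One possible way to fill in that placeholder, as a sketch rather than a definitive check, is to treat the GPU with the least memory in use as the available one. This assumes nvidia-smi is installed and that memory use is a good enough proxy for "busy"; pick_least_used_gpu is a hypothetical helper that reads "index, memory.used" rows on stdin, which keeps it usable without a GPU present.

```shell
# Hypothetical helper: read CSV rows of "index, memory.used" on stdin
# and print the index of the row with the smallest memory.used value.
pick_least_used_gpu() {
    sort -t, -k2,2n | head -n 1 | cut -d, -f1
}

# Assumption: "available" means "least memory in use right now".
GET_AVAILABLE_DEVICE() {
    nvidia-smi --query-gpu=index,memory.used \
               --format=csv,noheader,nounits |
        pick_least_used_gpu
}
```

Note that a training job takes a moment to allocate GPU memory, so two jobs launched back to back may both pick the same device. Adding a short delay between launches, or assigning GPUs by job slot instead, avoids that.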

Upvotes: 0
