Reputation: 188
How can I create a job that uses multiple GPUs of the same type without specifying that type directly? My experiment requires that all GPUs be of the same type, but that type can be any of the available ones.
Currently, I can only create a multi-GPU experiment by stating exactly which type I want:
--gres=gpu:gres_type:amount
If I don't specify gres_type, I sometimes get a mixed set of GPUs (say, 2x titan V and 2x titan X).
Upvotes: 0
Views: 488
Reputation: 59250
If you are fortunate enough that the cluster is consistent in the types of nodes that host the GPUs, and that the features of the nodes are properly specified and allow distinguishing between the nodes that host the different GPU types, you can use the --constraint parameter.
For the sake of the argument, let's assume that the nodes that host the titanV have haswell CPUs, that those that host the titanX have skylake CPUs, and that both are defined as features. Then, you can request:
--gres=gpu:2
--constraint=[haswell|skylake]
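As a minimal sketch of how that request could be assembled on the command line: the square brackets tell Slurm that exactly one of the listed features must hold for all allocated nodes, and the value should be quoted when passed through a shell, since an unquoted | or [ would be interpreted by the shell itself. The helper name build_constraint is hypothetical, submit.sh stands for your own batch script, and the feature names are the ones assumed above.

```shell
#!/bin/bash
# Hypothetical helper: join feature names into Slurm's "exactly one of" syntax.
# Example: build_constraint haswell skylake  prints  [haswell|skylake]
build_constraint() {
    local IFS='|'          # "$*" joins the arguments with the first character of IFS
    printf '[%s]\n' "$*"
}

# Keep the value quoted so the shell does not treat | as a pipe or [ as a glob.
constraint="$(build_constraint haswell skylake)"

# Dry run: this only prints the command. Drop "echo" to actually submit.
echo sbatch --gres=gpu:2 --constraint="$constraint" submit.sh
```

Inside a batch script, the same two options can instead be given as #SBATCH lines.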
If the above does not apply to your use case, you can submit two jobs and keep only the one that starts first. For that, give the jobs an identical name and use the singleton dependency.
Write a submission script like this one:
#!/bin/bash
#SBATCH --dependency=singleton
#SBATCH --job-name=gpujob
# Other options
# Once this job starts, cancel the still-pending jobs that share its name
scancel --state=PENDING --jobname=gpujob
# etc.
and submit it twice with
$ sbatch --gres=gpu:titanX:2 submit.sh
$ sbatch --gres=gpu:titanV:2 submit.sh
Each job will be assigned only one type of GPU, and the first one that starts will cancel the other one. This approach can scale up with more than two GPU types.
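To illustrate how this might scale to more GPU types, here is a small sketch that generates one submission per type. The function name submit_per_type is hypothetical, and the type names must match the gres types defined on your cluster; titanRTX is only a placeholder for a third type. The function prints the commands rather than running them, so you can inspect them first.

```shell
#!/bin/bash
# Hypothetical helper: print one sbatch command per GPU type (dry run).
# Usage: submit_per_type COUNT SCRIPT TYPE...
submit_per_type() {
    local count="$1" script="$2" gpu_type
    shift 2
    for gpu_type in "$@"; do
        # Each submission pins a single type, so no job can get a mixed allocation.
        echo sbatch "--gres=gpu:${gpu_type}:${count}" "$script"
    done
}

# Type names are site-specific assumptions; titanRTX is a placeholder.
submit_per_type 2 submit.sh titanX titanV titanRTX
```

Once the printed commands look right, remove the echo inside the loop to submit them for real. Because every job carries the same name and the singleton dependency, the first one to start cancels all the others that are still pending.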
Upvotes: 1