ProEns08

Reputation: 1914

Running Keras (TensorFlow backend) on a cluster using Slurm

I have the opportunity to run my TensorFlow training on a cluster managed by the Slurm workload manager (the cluster has nearly 400,000 cores, 40,000 GB of RAM, a performance of Rmax = 500 TFlop/s and Rpeak = 1000 TFlop/s, and AMD GPUs).

I work on image processing projects using deep learning algorithms.

My question is: how can I scale my Keras deep learning training to run on this cluster, using Slurm as the workload manager?

Upvotes: 2

Views: 1461

Answers (1)

Michael S.

Reputation: 21

Use Horovod to scale out the Keras training: https://github.com/uber/horovod
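As a rough illustration of what that could look like, here is a minimal sketch of the usual Horovod data-parallel pattern for Keras, assuming the job is launched with one process per GPU (for example through `srun`). The names `slurm_layout`, `scaled_learning_rate`, `train`, and `build_model` are hypothetical helpers made up for this example, the loss is a placeholder, and `dataset` is assumed to be a `tf.data.Dataset`. Whether Horovod is built for the cluster's AMD GPUs is something to confirm with the site administrators.

```python
import os


def slurm_layout(env=os.environ):
    """Read the task layout that Slurm exports to each launched process."""
    return {
        "rank": int(env.get("SLURM_PROCID", 0)),         # global task index
        "world_size": int(env.get("SLURM_NTASKS", 1)),   # total number of tasks
        "local_rank": int(env.get("SLURM_LOCALID", 0)),  # task index on this node
    }


def scaled_learning_rate(base_lr, world_size):
    """Linear scaling heuristic: the effective batch grows with world_size."""
    return base_lr * world_size


def train(build_model, dataset, base_lr=0.01, epochs=1):
    """Hypothetical entry point; build_model and dataset come from the caller."""
    # Imported lazily so the helpers above work without TensorFlow or Horovod.
    import tensorflow as tf
    import horovod.tensorflow.keras as hvd

    hvd.init()

    # Pin each process to a single GPU on its node.
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        tf.config.set_visible_devices(gpus[hvd.local_rank()], "GPU")

    # Give each worker a distinct slice of the input data.
    dataset = dataset.shard(hvd.size(), hvd.rank())

    model = build_model()
    optimizer = tf.keras.optimizers.SGD(
        learning_rate=scaled_learning_rate(base_lr, hvd.size()))
    # Wrap the optimizer so gradients are averaged across all workers.
    model.compile(optimizer=hvd.DistributedOptimizer(optimizer),
                  loss="sparse_categorical_crossentropy")  # placeholder loss

    # Start every worker from rank 0's initial weights.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
    model.fit(dataset, epochs=epochs, callbacks=callbacks,
              verbose=1 if hvd.rank() == 0 else 0)
    return model
```

`slurm_layout` is not required by Horovod itself; it only shows which Slurm variables describe the process layout, which can be handy for logging. The learning-rate scaling is a common heuristic rather than a requirement, so it may need tuning for a given model.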

Upvotes: 2
