Reputation: 167
I have a PC with the following specs:
My question: when I run my Keras training program on roughly 60k images (on GPU:1), the program loads the images into a data matrix of 12922.20 MB.
After that, it sits idle for about a minute and is then killed automatically. The same code trains fine on GPU:1 with 10k images.
I searched online and on SO, but I couldn't find or understand much about how GPU memory is allocated, and how it scales, when using multiple GPUs with Keras.
Any help would be appreciated!
Upvotes: 2
Views: 1723
Reputation: 15033
I would first recommend that you check the memory usage when training on a single GPU; I suspect that your dataset is not being loaded into GPU memory but into system RAM. A ~12.9 GB matrix (plus any copies made during preprocessing) can easily exhaust RAM, in which case the OS out-of-memory killer terminates the process, which matches the behaviour you describe.
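As a minimal sketch of how you could check both from inside the script (this assumes TensorFlow 2.5+ for tf.config.experimental.get_memory_info, the third-party psutil package, and at least one visible GPU):

import os
import psutil
import tensorflow as tf

# Resident RAM used by this Python process, in MB
rss_mb = psutil.Process(os.getpid()).memory_info().rss / 1024**2
print(f"Process RAM usage: {rss_mb:.1f} MB")

# Current/peak memory TensorFlow has allocated on the first visible GPU
info = tf.config.experimental.get_memory_info("GPU:0")
print(f"GPU current: {info['current'] / 1024**2:.1f} MB, "
      f"peak: {info['peak'] / 1024**2:.1f} MB")

If the RAM number is near your machine's total while the GPU number stays low, the data is living in RAM, not on the card.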
You can first restrict the process to a single card:

import os
# Must be set before TensorFlow is imported, so the process
# only sees one of the video cards
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # or "1"
Then check the exact mapping (i.e. which GPUs TensorFlow actually sees):

import tensorflow as tf
tf.config.list_physical_devices('GPU')
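With CUDA_VISIBLE_DEVICES restricted to a single card, this should return a one-element list along the lines of [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]; note that TensorFlow renumbers the visible devices from 0 regardless of the physical index you exposed.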
Then, in a terminal, you can use nvidia-smi to check how much GPU memory has been allocated; to monitor it continuously while training, use watch -n K nvidia-smi, where K is the refresh interval in seconds (e.g. watch -n 1 nvidia-smi).
When you use multiple GPUs, ensure that you use tf.distribute.MirroredStrategy() and declare your model creation + compile logic inside the strategy scope, like below:
strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

# Open a strategy scope.
with strategy.scope():
    # Everything that creates variables should be under the strategy scope.
    # In general this is only model construction & `compile()`.
    model = Model(...)
    model.compile(...)

# `fit()` and `evaluate()` can run outside the scope.
model.fit(train_dataset, validation_data=val_dataset, ...)
model.evaluate(test_dataset)
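For completeness, here is a minimal self-contained sketch of that pattern; the toy MNIST model and the hyper-parameters are my own placeholders, not anything from your setup:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

with strategy.scope():
    # Variable creation (model construction + compile) goes under the scope
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy'],
    )

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# fit()/evaluate() split each batch across the replicas automatically
model.fit(x_train, y_train, batch_size=256, epochs=1)
model.evaluate(x_test, y_test)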
Upvotes: 1