How to limit RAM usage while batch training in tensorflow?

Question

I am training a deep neural network with a large image dataset in mini-batches of size 40. My dataset is in .mat format (which I can easily change to any other format e.g. .npy format if necessitates) and before training, loaded as a 4-D numpy array. My problem is that while training, cpu-RAM (not GPU RAM) is very quickly exhausting and starts using almost half of my Swap memory.

My training code has the following pattern:

batch_size = 40
...
with h5py.File('traindata.mat', 'r') as _data:
    train_imgs = np.array(_data['train_imgs'])

# I can replace above with below loading, if necessary
# train_imgs = np.load('traindata.npy')

...

shape_4d = train_imgs.shape 
for epoch_i in range(max_epochs):
    for iter in range(shape_4d[0] // batch_size):
        y_ = train_imgs[iter*batch_size:(iter+1)*batch_size]
        ...
        ...

This seems like the initial loading of the full training data is itself becoming the bottle-neck (taking over 12 GB cpu RAM before I abort).

What is the best efficient way to tackle this bottle-neck?

Thanks in advance.

How to limit RAM usage while batch training in tensorflow?

Answers (1)

Related Questions