Reputation: 1215
I have the following code, which uses 3GB of physical RAM and 144GB of virtual RAM:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.metrics import AUC
from tensorflow.keras.callbacks import EarlyStopping
import numpy as np

model = Sequential()
model.add(Input(shape=(input_shape,)))
model.add(Dense(50, activation='relu', kernel_initializer='he_normal'))
model.add(Dropout(0.1))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=[AUC(curve='PR', name='auc')])
es = EarlyStopping(monitor='val_auc', patience=10, mode='max', verbose=1)
history = model.fit(X_train, y_train, batch_size=50, verbose=0,
validation_split=0.2, callbacks=[es], epochs=500)
eval_auc = max(history.history['val_auc'])
ix = np.argmax(history.history['val_auc'])
print("Number of iterations: ", ix)
print(eval_auc)
X_train has shape (44000, 1233) and dtype np.int8; it takes 52MB of memory. I am using TensorFlow 2.2. Why does training take so much memory, and what should I do to reduce the usage?
Upvotes: 2
Views: 2272
Reputation: 1
52MB per image = 52,000,000 bytes, so a batch of 50 images needs 52,000,000 * 50 = 2,600MB.
model.add(Dense(50, activation='relu', kernel_initializer='he_normal'))
weights: 50 * 52MB * 4 = 10,400MB (each output * each input * 4 bytes; I assume float32 weights. Strictly it also needs 50 * 4 bytes for the layer outputs, but that is negligible)
model.add(Dropout(0.1))
1040MB
model.add(Dense(1, activation='sigmoid'))
1040MB
Total memory: 2,600 + 10,400 + 1,040 + 1,040 = 15,080MB ≈ 14.7GB
Are you sure that you are using int8? Keras backends usually use float32 or float64, since lower precision can cause accuracy problems. If the backend uses float32, each image takes 208MB of memory, and a batch of 50 images uses 10,400MB instead of 2,600MB.
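A quick way to verify, as a minimal sketch (assuming X_train from the question is a NumPy array):
import tensorflow as tf

print(X_train.dtype)               # int8 in the question
print(X_train.nbytes / 1e6, "MB")  # 44000 * 1233 * 1 byte ≈ 54MB (≈ 52MiB)
print(tf.keras.backend.floatx())   # 'float32' by default; Keras casts inputs
                                   # to this dtype, quadrupling the size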
Are you sure that your image has only one color channel? If it is loaded as RGB, you need three times more RAM per image.
Does your code load other data? Minimize the code footprint to the minimum required for training and see if you still need the same amount of memory.
Upvotes: 0
Reputation: 434
By default, TensorFlow pre-allocates nearly all of the available GPU memory, which is bad for a variety of use cases, especially production and memory profiling. When Keras uses TensorFlow as its backend, it inherits this behavior.
TensorFlow allows you to change how it allocates GPU memory, and to set a limit on how much GPU memory it is allowed to allocate.
## keras example imports
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM
## extra imports to set GPU options
import tensorflow as tf
from keras import backend as k
###################################
# TensorFlow wizardry
config = tf.ConfigProto()
# Don't pre-allocate memory; allocate as-needed
config.gpu_options.allow_growth = True
# Only allow a total of half the GPU memory to be allocated
config.gpu_options.per_process_gpu_memory_fraction = 0.5
# Create a session with the above options specified.
k.tensorflow_backend.set_session(tf.Session(config=config))
###################################
model = Sequential()
model.add(Embedding(max_features, output_dim=256))
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)
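Note that tf.ConfigProto and tf.Session are TensorFlow 1.x APIs. On TensorFlow 2.x (which the question uses), the equivalent looks roughly like the following sketch; the 2048MB cap is an arbitrary example value:
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')

# Replacement for allow_growth: allocate GPU memory on demand
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

# Alternative to per_process_gpu_memory_fraction: a hard cap in MB
# (use instead of memory growth, not together with it on the same GPU)
# if gpus:
#     tf.config.experimental.set_virtual_device_configuration(
#         gpus[0],
#         [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)])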
To reduce memory further, you can create a sparse matrix from the dense (full) matrix, as in the sketch below.
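A minimal sketch with SciPy (assuming X_train from the question, and that it is mostly zeros, e.g. one-hot or count features):
from scipy.sparse import csr_matrix

# Store only the non-zero entries instead of the full dense array
X_sparse = csr_matrix(X_train)
print(X_sparse.data.nbytes / 1e6, "MB")  # memory used by the non-zero values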
Upvotes: 3