Reputation: 1304
I'm loading a Keras model that I previously trained, in order to initialize another network with its weights. Unfortunately, the loaded model fills my entire GPU memory, making the training of the new model impossible. Here is the code:
import gc
import keras
from keras.models import model_from_json

def loadModel(path, loss=None, optimizer=None):
    with open(path + '/model.json', 'r') as f:
        model = model_from_json(f.read())
    model.load_weights(path + '/model.h5')
    if loss and optimizer:
        model.compile(loss=loss, optimizer=optimizer)
    return model

model = loadModel('the/path/to/my/model')
# The GPU memory is filled
keras.backend.clear_session()
# memory still filled
del model
gc.collect()
# memory still filled
I checked multiple posts, and usually gc.collect() or clear_session() does the trick, but for me it hasn't worked so far. Any idea?
PS: I'm using TensorFlow as backend.
Upvotes: 0
Views: 5260
Reputation: 1304
As @MatiasValdenegro said, TensorFlow allocates the entire GPU memory up front, which is why I couldn't see any difference after deleting the model. So I loaded my pre-trained model, created my new model, and initialized its weights with those from the pre-trained one. After that, I deleted the pre-trained model using del model and gc.collect(). Since the new model has one more layer than the pre-trained one, I also had to reduce my batch size to avoid running out of memory.
Upvotes: 1