When I run the following code, folders named cp_01, cp_02, ... are created, whereas I want a checkpoint file to be saved at the end of every epoch. I then want to use the latest saved checkpoint file to load the weights into my model instance with model.load_weights(tf.train.latest_checkpoint('model_checkpoints_5000')).
How can I do this?
import os
import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
# Use the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = x_train / 255.0
x_test = x_test / 255.0
# Use smaller subset -- speeds things up
x_train = x_train[:10000]
y_train = y_train[:10000]
x_test = x_test[:1000]
y_test = y_test[:1000]
# define a function that creates a new instance of a simple CNN.
def create_model():
    model = Sequential([
        Conv2D(filters=16, input_shape=(32, 32, 3), kernel_size=(3, 3),
               activation='relu', name='conv_1'),
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu', name='conv_2'),
        MaxPooling2D(pool_size=(4, 4), name='pool_1'),
        Flatten(name='flatten'),
        Dense(units=32, activation='relu', name='dense_1'),
        Dense(units=10, activation='softmax', name='dense_2')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
checkpoint_5000_path = './model_checkpoints_5000/cp_{epoch:02d}'
checkpoint_5000 = ModelCheckpoint(filepath=checkpoint_5000_path,
                                  save_weights=True,
                                  save_freq='epoch',
                                  verbose=1)
model = create_model()
model.fit(x=x_train,
          y=y_train,
          epochs=3,
          validation_data=(x_test, y_test),
          batch_size=10,
          callbacks=[checkpoint_5000])
My output is the following.
Epoch 00001: saving model to ./model_checkpoints_5000\cp_01
INFO:tensorflow:Assets written to: ./model_checkpoints_5000\cp_01\assets
Epoch 2/3
1000/1000 [==============================] - 3s 3ms/step - loss: 1.4493 - accuracy: 0.4744 - val_loss: 1.4664 - val_accuracy: 0.4770
I have also tried adding .h5 to the path:
'./model_checkpoints_5000/cp_{epoch:02d}.h5'
However, tf.train.latest_checkpoint('model_checkpoints_5000') then returns None, whereas I expected it to return the file name cp_03.h5.
Upvotes: 0
Views: 756
After training the model, use the code below to list the files saved in the checkpoint directory:
checkpoint_dir = os.path.dirname(checkpoint_5000_path)
os.listdir(checkpoint_dir)
Output:
['cp_01',
'cp_00.h5',
'cp_03',
'cp_00.data-00000-of-00001',
'cp_00.index',
'cp_03.h5',
'cp_02',
'cp_01.h5',
'cp_02.h5',
'checkpoint']
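The listing above mixes SavedModel folders, TensorFlow-format checkpoint files and .h5 files. As far as I know, tf.train.latest_checkpoint only looks at the 'checkpoint' state file, so it does not see the .h5 files. If the .h5 files are the ones you want, a minimal sketch for picking the newest one could look like this. It uses only the standard library; latest_h5_checkpoint is a hypothetical helper, and it assumes the files are named cp_NN.h5 as in the question.

```python
import os
import re


def latest_h5_checkpoint(checkpoint_dir, pattern=r"cp_(\d+)\.h5$"):
    """Return the path of the .h5 file with the highest epoch number, or None.

    Hypothetical helper: assumes file names follow the cp_NN.h5 pattern
    used in the question.
    """
    best_epoch, best_name = -1, None
    for name in os.listdir(checkpoint_dir):
        match = re.match(pattern, name)
        if match is None:
            continue  # skip folders, .index/.data files and 'checkpoint'
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best_epoch, best_name = epoch, name
    if best_name is None:
        return None
    return os.path.join(checkpoint_dir, best_name)
```

The returned path can then be passed to model.load_weights(...) in place of the tf.train.latest_checkpoint(...) call.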
Please check this link for more details.
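A likely cause of the folders in the question is the keyword save_weights: the documented ModelCheckpoint argument is save_weights_only, and when it is left at its default the callback saves the whole model at each epoch, which matches the "Assets written to" line in the log. The sketch below assumes TensorFlow 2.x with its bundled Keras 2 API (newer Keras releases require weight files to end in .weights.h5, so this will not work there as written). make_checkpoint_callback and load_latest_weights are illustrative names, not part of the original post.

```python
import os

# Same directory and naming pattern as in the question. There is no ".h5"
# suffix, so the weights are written in the TensorFlow checkpoint format,
# which is the format tf.train.latest_checkpoint understands.
CHECKPOINT_DIR = "./model_checkpoints_5000"
CHECKPOINT_PATH = os.path.join(CHECKPOINT_DIR, "cp_{epoch:02d}")


def make_checkpoint_callback():
    """Build a callback that saves only the weights, once per epoch."""
    # Imported lazily so the path constants are usable without TensorFlow.
    from tensorflow.keras.callbacks import ModelCheckpoint

    return ModelCheckpoint(
        filepath=CHECKPOINT_PATH,
        save_weights_only=True,  # note: save_weights_only, not save_weights
        save_freq="epoch",
        verbose=1,
    )


def load_latest_weights(model):
    """Load the most recent TensorFlow-format checkpoint into `model`."""
    import tensorflow as tf

    latest = tf.train.latest_checkpoint(CHECKPOINT_DIR)
    if latest is None:
        raise FileNotFoundError(
            "No TensorFlow-format checkpoint found in " + CHECKPOINT_DIR
        )
    model.load_weights(latest)
    return latest
```

With the question's code, this would be used as callbacks=[make_checkpoint_callback()] in model.fit, followed by load_latest_weights(create_model()) after training.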
Upvotes: 1