Reputation: 919
So I am using the ModelCheckpoint callback to save the best epoch of a model I am training. It saves with no errors, but when I try to load it, I get the error:
2019-07-27 22:58:04.713951: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open C:\Users\Riley\PycharmProjects\myNN\cp.ckpt: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
I have tried using the absolute/full path, but no luck. I'm sure I could use EarlyStopping, but I'd still like to understand why I am getting the error. Here is my code:
from __future__ import absolute_import, division, print_function
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
import datetime
import statistics
(train_images, train_labels), (test_images, test_labels) = np.load("dataset.npy", allow_pickle=True)
train_images = train_images / 255
test_images = test_images / 255
train_labels = list(map(float, train_labels))
test_labels = list(map(float, test_labels))
train_labels = [i/10 for i in train_labels]
test_labels = [i/10 for i in test_labels]
'''
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(128, 128)),
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(1)
])
'''
start_time = datetime.datetime.now()
model = keras.Sequential([
    keras.layers.Conv2D(32, kernel_size=(5, 5), strides=(1, 1), activation='relu', input_shape=(128, 128, 1)),
    keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    keras.layers.Dropout(0.2),
    keras.layers.Conv2D(64, (5, 5), activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Dropout(0.2),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1000, activation='relu'),
    keras.layers.Dense(1)
])
model.compile(loss='mean_absolute_error',
              optimizer=keras.optimizers.SGD(lr=0.01),
              metrics=['mean_absolute_error', 'mean_squared_error'])
train_images = train_images.reshape(328, 128, 128, 1)
test_images = test_images.reshape(82, 128, 128, 1)
model.fit(train_images, train_labels, epochs=100, callbacks=[keras.callbacks.ModelCheckpoint("cp.ckpt", monitor='mean_absolute_error', save_best_only=True, verbose=1)])
model.load_weights("cp.ckpt")
predictions = model.predict(test_images)
totalDifference = 0
for i in range(82):
    print("%s: %s" % (test_labels[i] * 10, predictions[i] * 10))
    totalDifference += abs(test_labels[i] - predictions[i])
avgDifference = totalDifference / 8.2
print("\n%s\n" % avgDifference)
print("Time Elapsed:")
print(datetime.datetime.now() - start_time)
Upvotes: 4
Views: 5848
Reputation: 1547
model.load_weights will not work here; the reason is explained in the other answer.
You can load the weights with the code below. Build your model first, and then restore the weights into it. I hope this helps.
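import tensorflow as tf
model = dense_net()  # dense_net() is this answer's own function that builds the model
ckpt = tf.train.Checkpoint(
    step=tf.Variable(1, dtype=tf.int64), net=model)
# latest_checkpoint() expects the checkpoint directory, not a single shard file
ckpt.restore(tf.train.latest_checkpoint("/kaggle/working/training_1"))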
Upvotes: 2
Reputation: 24691
TL;DR: you are saving the whole model but then trying to load only weights, which does not work.
Your model's fit:
model.fit(
    train_images,
    train_labels,
    epochs=100,
    callbacks=[
        keras.callbacks.ModelCheckpoint(
            "cp.ckpt", monitor="mean_absolute_error", save_best_only=True, verbose=1
        )
    ],
)
Since save_weights_only=False by default in ModelCheckpoint, you are saving the whole model to .ckpt.
BTW, the file should be named .hdf5 or .h5, as it is in Hierarchical Data Format 5. Since Windows is not extension-agnostic, you may run into problems if tensorflow / keras relies on the extension on this OS.
On the other hand, you are loading only the model's weights, while the file contains the whole model:
model.load_weights("cp.ckpt")
TensorFlow's checkpointing mechanism (.ckpt) is different from Keras's (.hdf5), so watch out for that (there are plans to integrate them more closely).
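The "bad magic number" message in the question fits this mismatch: the checkpoint reader opened a file that is not in its own format. As a rough sketch (the function name `guess_checkpoint_format` is hypothetical, and it only covers the common case of an HDF5 file without a user block), you can check which format is actually on disk using nothing but the standard library:

```python
import os

# The first 8 bytes of an HDF5 file that has no user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def guess_checkpoint_format(path):
    """Return 'hdf5', 'tf_checkpoint', or 'unknown' for a saved path."""
    # A Keras HDF5 save is a single file that starts with the HDF5 signature,
    # whatever extension it was given.
    if os.path.isfile(path):
        with open(path, "rb") as f:
            if f.read(8) == HDF5_SIGNATURE:
                return "hdf5"
    # A TensorFlow-format checkpoint treats `path` as a prefix and writes
    # `<path>.index` plus one or more `<path>.data-*` shard files.
    if os.path.isfile(path + ".index"):
        return "tf_checkpoint"
    return "unknown"


if __name__ == "__main__":
    print(guess_checkpoint_format("cp.ckpt"))
```

If this reports hdf5 for cp.ckpt, the file was written by the Keras HDF5 saver, whatever its extension says.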
So either keep the callback as it is now BUT load the saved file with keras.models.load_model("model.hdf5") (with the checkpoint saved under that name), or add the save_weights_only=True argument to ModelCheckpoint:
model.fit(
    train_images,
    train_labels,
    epochs=100,
    callbacks=[
        keras.callbacks.ModelCheckpoint(
            "weights.hdf5",
            monitor="mean_absolute_error",
            save_best_only=True,
            verbose=1,
            save_weights_only=True,  # Specify this
        )
    ],
)
and then you can use your model.load_weights("weights.hdf5").
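To make the two matching pairings concrete, here is a minimal sketch assuming TensorFlow 2.x with tf.keras. The tiny model, the random data, and the file names `best.weights.h5` and `best_model.h5` are all made up for illustration. Some newer Keras releases are stricter about file suffixes, which is why the sketch uses these names rather than .hdf5.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras


def build_model():
    # Tiny stand-in for the CNN in the question (hypothetical architecture).
    model = keras.Sequential([
        keras.Input(shape=(3,)),
        keras.layers.Dense(4, activation="relu"),
        keras.layers.Dense(1),
    ])
    model.compile(loss="mean_absolute_error", optimizer="sgd")
    return model


# Random placeholder data, only there so that fit() has something to train on.
x = np.random.rand(32, 3).astype("float32")
y = np.random.rand(32, 1).astype("float32")
workdir = tempfile.mkdtemp()

# Pairing 1: weights-only checkpoint, restored with load_weights().
weights_path = os.path.join(workdir, "best.weights.h5")
model = build_model()
model.fit(x, y, epochs=2, verbose=0, callbacks=[
    keras.callbacks.ModelCheckpoint(
        weights_path, monitor="loss",
        save_best_only=True, save_weights_only=True),
])
restored = build_model()  # the architecture must be rebuilt first
restored.load_weights(weights_path)

# Pairing 2: full-model checkpoint, restored with load_model().
model_path = os.path.join(workdir, "best_model.h5")
model2 = build_model()
model2.fit(x, y, epochs=2, verbose=0, callbacks=[
    keras.callbacks.ModelCheckpoint(
        model_path, monitor="loss", save_best_only=True),
])
# compile=False is enough when the loaded model is only used for prediction.
loaded = keras.models.load_model(model_path, compile=False)
```

Mixing the pairings, i.e. saving the full model and then calling load_weights() on it, is the mismatch described above.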
Upvotes: 6
Reputation: 2091
import tensorflow as tf
# Create some variables.
v1 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="v1")
v2 = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="v2")
# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    # Do some work with the model.
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print("Model saved in file: %s" % save_path)
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
    # Do some work with the model
Upvotes: 0