learner

Reputation: 959

tf.keras.callbacks.ModelCheckpoint TypeError: Unable to serialize 1.0000000656873453e-05 to JSON

I am creating a custom tf.keras model with additional layers on top of a pretrained MobileNet base. Model training runs fine, but saving the best model via the checkpoint callback raises an error. Below is a snippet of the code I used:

import os

import tensorflow as tf
import tensorflow_addons as tfa

pretrained_model = tf.keras.applications.MobileNetV2(
    weights='imagenet',
    include_top=False,
    input_shape=[*IMAGE_SIZE, IMG_CHANNELS])
pretrained_model.trainable = True  # fine tuning

model = tf.keras.Sequential([
    # Convert images from int [0, 255] to the format expected by this model
    tf.keras.layers.Lambda(
        lambda data: tf.keras.applications.mobilenet.preprocess_input(
            tf.cast(data, tf.float32)),
        input_shape=[*IMAGE_SIZE, 3]),
    pretrained_model,
    tf.keras.layers.GlobalAveragePooling2D()])

model.add(tf.keras.layers.Dense(64, name='object_dense',
                                kernel_regularizer=tf.keras.regularizers.l2(l2=0.001)))
model.add(tf.keras.layers.BatchNormalization(scale=False, center=False))
model.add(tf.keras.layers.Activation('relu', name='relu_dense_64'))
model.add(tf.keras.layers.Dropout(rate=0.2, name='dropout_dense_64'))
model.add(tf.keras.layers.Dense(32, name='object_dense_2',
                                kernel_regularizer=tf.keras.regularizers.l2(l2=0.01)))
model.add(tf.keras.layers.BatchNormalization(scale=False, center=False))
model.add(tf.keras.layers.Activation('relu', name='relu_dense_32'))
model.add(tf.keras.layers.Dropout(rate=0.2, name='dropout_dense_32'))
model.add(tf.keras.layers.Dense(16, name='object_dense_16',
                                kernel_regularizer=tf.keras.regularizers.l2(l2=0.01)))
model.add(tf.keras.layers.Dense(len(CLASS_NAMES), activation='softmax', name='object_prob'))
m1 = tf.keras.metrics.CategoricalAccuracy()
m2 = tf.keras.metrics.Recall()
m3 = tf.keras.metrics.Precision()



optimizers = [
    tfa.optimizers.AdamW(learning_rate=lr * 0.001, weight_decay=wd),
    tfa.optimizers.AdamW(learning_rate=lr, weight_decay=wd)
]

optimizers_and_layers = [(optimizers[0], model.layers[0]), (optimizers[1], model.layers[1:])]

optimizer = tfa.optimizers.MultiOptimizer(optimizers_and_layers)

model.compile(
    optimizer=optimizer,
    loss='categorical_crossentropy',
    metrics=[m1, m2, m3],
)

checkpoint_path = os.path.join(os.getcwd(), 'keras_model')
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                   monitor='categorical_accuracy',
                                                   save_best_only=True,
                                                   save_weights_only=True)

history = model.fit(train_data, validation_data=test_data, epochs=N_EPOCHS, callbacks=[checkpoint_cb])

The tf.keras.callbacks.ModelCheckpoint callback gives me this error:

TypeError: Unable to serialize 1.0000000656873453e-05 to JSON. Unrecognized type <class 'tensorflow.python.framework.ops.EagerTensor'>.

Below is a link to a Google Colab notebook in case you want to reproduce the issue:

https://colab.research.google.com/drive/1wQbUFfhtDaB5Xta574UkAXJtthui7Bt9?usp=sharing

Upvotes: 4

Views: 7782

Answers (1)

rchome

Reputation: 2723

This seems to be a bug in TensorFlow or Keras. The tensor that's being serialized to JSON comes from your optimizer definition:

model.optimizer.optimizer_specs[0]["optimizer"].get_config()["weight_decay"]

<tf.Tensor: shape=(), dtype=float32, numpy=1.0000001e-05>
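
You can reproduce the serialization failure in isolation (a minimal sketch; the schedule parameters here are assumptions loosely based on the notebook):

import json
import tensorflow as tf

step = tf.Variable(0, trainable=False)
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=14, decay_rate=0.8, staircase=True)
wd = lambda: 1e-2 * schedule(step)

print(type(wd()))  # <class 'tensorflow.python.framework.ops.EagerTensor'>
json.dumps({"weight_decay": wd()})  # TypeError: not JSON serializable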

From the implementation of tfa.optimizers.AdamW, the weight_decay is serialized using tf.keras.optimizers.Adam._serialize_hyperparameter. That function assumes that if you pass in a callable for the hyperparameter, it returns a non-tensor value when called, but in your notebook it was implemented as

wd = lambda: 1e-02 * schedule(step)
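
For context, the serialization logic looks roughly like this (a paraphrased sketch of _serialize_hyperparameter, not the exact Keras source):

def _serialize_hyperparameter(self, name):
    value = self._hyper[name]
    # A LearningRateSchedule is serialized via its own config...
    if isinstance(value, tf.keras.optimizers.schedules.LearningRateSchedule):
        return tf.keras.optimizers.schedules.serialize(value)
    # ...but a plain callable is simply called, and whatever it returns
    # (here a Tensor) goes straight into the JSON config.
    if callable(value):
        return value()
    return value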

Here schedule() returns a Tensor, so wd() returns a Tensor, and that tensor ends up in the config. I tried various ways to convert the tensor to a scalar value, but I couldn't get any of them to work. As a workaround, I implemented wd as a LearningRateSchedule so it serializes properly, though the code is clunkier. Replacing the definitions of wd and lr with this code allowed model training to complete for me without any issues.

class MyExponentialDecay(tf.keras.optimizers.schedules.ExponentialDecay):
  def __call__(self, step):
    # Fold the 1e-2 weight-decay factor into the schedule itself, so wd
    # is a LearningRateSchedule instance rather than a bare callable
    return 1e-2 * super().__call__(step)

wd = MyExponentialDecay(
    initial_learning_rate,
    decay_steps=14,
    decay_rate=0.8,
    staircase=True)
lr = 1e2 * wd(step)  # undo the 1e-2 factor to recover the schedule value
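
To sanity-check the workaround (a quick sketch; initial_learning_rate comes from the notebook), you can confirm that weight_decay now serializes to a plain dict instead of a tensor:

import json
import tensorflow_addons as tfa

opt = tfa.optimizers.AdamW(learning_rate=1e-3, weight_decay=wd)
# weight_decay is now a serialized LearningRateSchedule (a plain dict),
# so the JSON encoding no longer fails on it
print(json.dumps(opt.get_config()["weight_decay"]))

This works because _serialize_hyperparameter now takes the LearningRateSchedule branch instead of calling wd and getting back a Tensor.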

After training completes, the model.save() call will still fail. I believe this is the same issue that was reported in the TensorFlow Addons GitHub. The summary of that issue is that the get_config() function for the optimizers includes a "gv" key in the config which stores Tensor objects, which aren't JSON serializable.
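
You can see this by inspecting the MultiOptimizer config yourself (a sketch; the key names follow the description in that issue):

config = model.optimizer.get_config()
# Each optimizer spec carries a "gv" entry holding gradient/variable
# Tensor pairs, which the JSON encoder can't handle
print(config["optimizer_specs"][0].keys())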

At the time of writing, this issue has not been resolved. If you don't need the optimizer state in the final saved model, you can pass the include_optimizer=False argument to model.save(), which worked for me. Otherwise, you may need to patch the library or the specific optimizer class implementation to get rid of the "gv" key in the config, like the OP did in that thread.
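
For example (the save path here is arbitrary):

# Save without optimizer state; this skips serializing the MultiOptimizer
# config that contains the offending tensors
model.save('saved_model', include_optimizer=False)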

Upvotes: 3
