didi

Reputation: 21

Issue porting code from keras to tf.keras

I am working on porting a basic MNIST model training program from keras 2.3.1 to tf.keras (TensorFlow 2.0) and am seeing some strange behavior.

My initial code trains very well, but after updating my imports the model training gets bogged down at a much higher loss.

Here is my code:

# Initialize the network: a 784 -> 152 -> 76 -> 38 -> 4 -> 38 -> 76 -> 152 -> 784 autoencoder
input_layer = Input(shape=(784,))

# Encoder
network1 = Dense(152, activation=nn.tanh)(input_layer)
network2 = Dense(76, activation=nn.tanh)(network1)
network3 = Dense(38, activation=nn.tanh)(network2)

# Bottleneck
network4 = Dense(4, activation=nn.tanh)(network3)

# Decoder
network5 = Dense(38, activation=nn.tanh)(network4)
network6 = Dense(76, activation=nn.tanh)(network5)
network7 = Dense(152, activation=nn.tanh)(network6)
output = Dense(784, activation=nn.tanh)(network7)

autoencoder = Model(inputs=input_layer, outputs=output)
autoencoder.compile(optimizer='adadelta', loss='MSE')

# Create a callback that saves the model's weights
cp_callback = ModelCheckpoint(filepath=conf.checkpoint_path,
                              save_weights_only=True, verbose=1)

# Train the model (the autoencoder reconstructs its own input)
autoencoder.fit(x_data_train, x_data_train,
                epochs=conf.epochs,
                batch_size=conf.batch_size,
                shuffle=True,
                callbacks=[cp_callback])

My keras imports (old):

from keras.layers import Input, Dense
from keras.models import Model
from keras.callbacks import ModelCheckpoint

My tf.keras imports (new):

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import ModelCheckpoint

After the first epoch of training with the old imports (the model finishes around a loss of 0.025 after 500 epochs):

Epoch 1/20 60000/60000 [==============================] - 11s 180us/step - loss: 0.0614

After the first epoch of training with the new imports (the model stalls around a loss of 0.06 after 500 epochs):

Epoch 1/20 60000/60000 [============================>.] - 10s 167us/step - loss: 0.1152

Upvotes: 0

Views: 158

Answers (2)

didi

Reputation: 21

After searching the internet for a few hours, I found a small comment here. This comment is the answer:

"keras adadelta learning rate is 1.0, tf adadelta learning rate is 0.001. So the latter learns much slower. You might want to adjust learning rate and test for consistency."

tf.keras Adadelta uses a default learning rate of 0.001, while keras uses 1.0. Updating my optimizer's learning rate worked!
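
If you want to verify the default yourself, a minimal sketch that inspects the optimizer's config via tf.keras's own API (the printed value is the 0.001 default mentioned above):

import tensorflow as tf

# tf.keras Adadelta ships with learning_rate=0.001 by default
opt = tf.keras.optimizers.Adadelta()
print(opt.get_config()["learning_rate"])  # 0.001 (standalone keras defaults to 1.0)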

from tensorflow.keras import optimizers

autoencoder.compile(optimizer=optimizers.Adadelta(learning_rate=1.0), loss='MSE')

Upvotes: 1

Piotr Rarus

Reputation: 952

Have you checked your gradients? Training an NN model is non-deterministic, since you're initializing a different set of weights each run. At the very least, try using a predefined random seed (see the sketch below). You're not using any regularization, so things can get out of hand quite a bit.
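
A minimal sketch of fixing the seeds for reproducibility in TF 2.0, placed before building the model; the seed value 42 is an arbitrary choice:

import numpy as np
import tensorflow as tf

# Fix the seeds so weight initialization (and any NumPy-side shuffling) is repeatable
np.random.seed(42)
tf.random.set_seed(42)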

Upvotes: 0
