user9933193

Reputation: 35

Tensorflow model performing significantly worse than Keras model

I was having an issue with my TensorFlow model and decided to try Keras. As far as I can tell, I am creating the same model with the same parameters in both, but the TensorFlow model just outputs the mean value of train_y, while the Keras model's output actually varies according to the input. Am I missing something in my tf.Session? I usually use TensorFlow and have never had a problem like this.

TensorFlow code:

import numpy as np
import tensorflow as tf

score_inputs = tf.placeholder(np.float32, shape=(None, 100))
targets = tf.placeholder(np.float32, shape=(None), name="targets")

l2 = tf.contrib.layers.l2_regularizer(0.01)

first_layer = tf.layers.dense(score_inputs, 100, activation=tf.nn.relu, kernel_regularizer=l2)
outputs = tf.layers.dense(first_layer, 1, activation = None, kernel_regularizer=l2)

optimizer = tf.train.AdamOptimizer(0.001)
l2_loss = tf.losses.get_regularization_loss()
loss = tf.reduce_mean(tf.square(tf.subtract(targets, outputs)))
loss += l2_loss
rmse = tf.sqrt(tf.reduce_mean(tf.square(outputs - targets)))
mae = tf.reduce_mean(tf.sqrt(tf.square(outputs - targets)))
training_op = optimizer.minimize(loss)

batch_size = 32

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(10):
        avg_train_error = []
        for i in range(len(train_x) // batch_size):
            batch_x = train_x[i*batch_size: (i+1)*batch_size]
            batch_y = train_y[i*batch_size: (i+1)*batch_size]
            _, train_loss = sess.run([training_op, loss], {score_inputs: batch_x, targets: batch_y})

    feed = {score_inputs: test_x, targets: test_y}
    test_loss, test_mae, test_rmse, test_outputs = sess.run([loss, mae, rmse, outputs], feed)

This has a mean absolute error of 0.682 and root mean squared error of 0.891.

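Keras code:

import keras
from keras import regularizers
from keras.layers import Dense, Input
from keras.models import Model

inputs = Input(shape=(100,))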
hidden = Dense(100, activation="relu", kernel_regularizer = regularizers.l2(0.01))(inputs)
outputs = Dense(1, activation=None, kernel_regularizer = regularizers.l2(0.01))(hidden)
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.Adam(lr=0.001), loss='mse', metrics=['mae'])
model.fit(train_x, train_y, batch_size=32, epochs=10, shuffle=False)
keras_pred = model.predict(test_x)

This has a mean absolute error of 0.601 and root mean square error of 0.753.

It appears to me that I am defining the same network in both cases, yet, as I said, the TensorFlow model only outputs the mean value of train_y, while the Keras model performs a lot better. Any suggestions?

Upvotes: 3

Views: 682

Answers (3)

Anakin

Reputation: 2010

@Priyank Pathak and @lehiester have made some valid points. Taking their suggestions into account, I suggest changing the following and checking again:

  1. Use the same kernel_initializer and data type in both models
  2. Train for more epochs so the comparison is less sensitive to the starting point
  3. Seed the random, numpy and tensorflow random number generators

Upvotes: 1
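
The seeding step in point 3 could be sketched roughly as follows. The helper name `seed_everything` is hypothetical (it is not part of any library), and the TensorFlow branch assumes either the 1.x call `tf.set_random_seed` or the 2.x call `tf.random.set_seed` is available; in TensorFlow 1.x the seed should be set before the graph is built.

```python
import random

import numpy as np


def seed_everything(seed):
    """Seed Python's random module, NumPy, and TensorFlow (if installed).

    Hypothetical helper for illustration only.
    """
    random.seed(seed)
    np.random.seed(seed)
    try:
        import tensorflow as tf
    except ImportError:
        return  # TensorFlow is not installed; nothing more to seed
    if hasattr(tf, "set_random_seed"):
        tf.set_random_seed(seed)  # TensorFlow 1.x graph-level seed
    else:
        tf.random.set_seed(seed)  # TensorFlow 2.x


# Seeding twice with the same value reproduces the same NumPy draws.
seed_everything(42)
first = np.random.rand(3)
seed_everything(42)
second = np.random.rand(3)
print(np.array_equal(first, second))
```

Calling this once at the top of each script, before any layers are created, should make the two runs easier to compare, although some GPU operations may still be non-deterministic.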

Priyank Pathak

Reputation: 474

I'm going to try to point out the differences between the two codes.

The Keras documentation shows that Dense weights are initialized with 'glorot_uniform' by default, whereas your TensorFlow weights use whatever the default for tf.layers.dense is, and the documentation doesn't clearly state what that TensorFlow initialization is. So the initialization is most probably different, and it definitely matters.

The second difference is most probably the data type of the input: one is numpy.float32 and the other is the Keras default input type, which again isn't specified in the documentation.
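
A further difference that may be worth ruling out is the shape of the targets. In the TensorFlow code, `targets` is declared with `shape=(None)` while `outputs` has shape `(batch, 1)`. If `train_y` is a 1-D array (an assumption, since the question doesn't show its shape), `targets - outputs` broadcasts to a `(batch, batch)` matrix instead of a column of per-sample errors. The sketch below uses NumPy only, with made-up data, to show that under such a broadcast loss a constant prediction at the target mean scores better than a perfect prediction, which would be consistent with the symptom described.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32                                   # one batch, as in the question
targets = rng.normal(size=n)             # 1-D targets, shape (32,) (assumed)
outputs = rng.normal(size=(n, 1))        # network output, shape (32, 1)

# Shapes (32,) and (32, 1) broadcast to (32, 32): every target is
# compared with every output, not just its own.
diff_broadcast = targets - outputs
diff_aligned = targets.reshape(-1, 1) - outputs   # intended (32, 1) errors
print(diff_broadcast.shape, diff_aligned.shape)


def broadcast_mse(pred_column):
    """MSE as computed when 1-D targets meet a column of predictions."""
    return np.mean((targets - pred_column) ** 2)


perfect_pred = targets.reshape(-1, 1)               # exactly right per sample
mean_pred = np.full((n, 1), targets.mean())         # constant mean prediction

loss_perfect = broadcast_mse(perfect_pred)
loss_mean = broadcast_mse(mean_pred)
# The constant mean prediction gets the lower (better) broadcast loss.
print(loss_mean < loss_perfect)
```

If this turns out to apply, reshaping the targets to a column (for example `tf.reshape(targets, [-1, 1])`) or squeezing `outputs` to 1-D before computing the loss would be one way to line the shapes up.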

Upvotes: 1

lehiester

Reputation: 900

There isn't any obvious difference between the models, but the different results could be explained by random variation in training. Especially since you're only training for 10 epochs, the results could be fairly sensitive to the randomly chosen initial weights.

Try training for more epochs (e.g. 1000) and running each model several times (e.g. 5). The average results should be fairly close.
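
Averaging over several runs could be sketched like this. `average_over_runs` and `dummy_train_and_evaluate` are hypothetical names for illustration; the dummy function just returns a noisy number and stands in for whatever code trains one model with a given seed and returns its test MAE.

```python
import random
import statistics


def average_over_runs(train_and_evaluate, n_runs=5):
    """Call train_and_evaluate(seed) n_runs times; return (mean, stdev).

    n_runs must be at least 2 for the standard deviation to be defined.
    """
    scores = [train_and_evaluate(seed) for seed in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores)


def dummy_train_and_evaluate(seed):
    """Stand-in for real training: returns a fake MAE that depends on the seed."""
    rng = random.Random(seed)
    return 0.6 + rng.uniform(-0.05, 0.05)


mean_mae, std_mae = average_over_runs(dummy_train_and_evaluate, n_runs=5)
print(f"MAE over 5 runs: {mean_mae:.3f} +/- {std_mae:.3f}")
```

If the gap between the TensorFlow and Keras averages is much larger than the spread across runs, random initialization alone probably doesn't explain it.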

Upvotes: 0
