Reputation: 35
I was having an issue with my Tensorflow model and decided to try Keras. It appears to me at least that I am creating the same model with the same parameters, but the Tensorflow model just outputs the mean value of train_y while the Keras model actually varies according the input. Am I missing something in my tf.Session? I usually use Tensorflow and have never had a problem like this. Tensorflow Code:
score_inputs = tf.placeholder(np.float32, shape=(None, 100))
targets = tf.placeholder(np.float32, shape=(None), name="targets")
l2 = tf.contrib.layers.l2_regularizer(0.01)
first_layer = tf.layers.dense(score_inputs, 100, activation=tf.nn.relu, kernel_regularizer=l2)
outputs = tf.layers.dense(first_layer, 1, activation = None, kernel_regularizer=l2)
optimizer = tf.train.AdamOptimizer(0.001)
l2_loss = tf.losses.get_regularization_loss()
loss = tf.reduce_mean(tf.square(tf.subtract(targets, outputs)))
loss += l2_loss
rmse = tf.sqrt(tf.reduce_mean(tf.square(outputs - targets)))
mae = tf.reduce_mean(tf.sqrt(tf.square(outputs - targets)))
training_op = optimizer.minimize(loss)
batch_size = 32
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(10):
avg_train_error = []
for i in range(len(train_x) // batch_size):
batch_x = train_x[i*batch_size: (i+1)*batch_size]
batch_y = train_y[i*batch_size: (i+1)*batch_size]
_, train_loss = sess.run([training_op, loss], {score_inputs: batch_x, targets: batch_y})
feed = {score_inputs: test_x, targets: test_y}
test_loss, test_mae, test_rmse, test_ouputs = sess.run([loss, mae, rmse, outputs], feed)
This has a mean absolute error of 0.682 and root mean squared error of 0.891.
The Keras Code:
inputs = Input(shape=(100,))
hidden = Dense(100, activation="relu", kernel_regularizer = regularizers.l2(0.01))(inputs)
outputs = Dense(1, activation=None, kernel_regularizer = regularizers.l2(0.01))(hidden)
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.Adam(lr=0.001), loss='mse', metrics=['mae'])
model.fit(train_x, train_y, batch_size=32, epochs=10, shuffle=False)
keras_pred = model.predict(test_x)
This has a mean absolute error of 0.601 and root mean square error of 0.753.
It appears to me that I am defining the same network in both instances, yet as I said the Tensorflow model only outputs the mean value of train_y, while the Keras model performs a lot better. Any suggestions?
Upvotes: 3
Views: 682
Reputation: 2010
@Priyank Pathak and @lehiester have given some valid points. Taking their suggestions into account, I can suggest you to change the following things and check again:
kernel_initializer
and data_type
random
, numpy
and tensorflow
functionsUpvotes: 1
Reputation: 474
I'm going to try to point out the differences between the two codes.
Keras documentation here shows that the weights are initialized by 'glorot_uniform' whereas your weights are initialized by default, most probably at random as the documentation doesn't clearly specify what it is tensorflow intialization. So initialization is most probably different and it definitely matters.
The second difference most probably is because of the difference in the data type of input, one being numpy.float32 and other being keras default input type, which again hasn't been specified by the documentation
Upvotes: 1
Reputation: 900
There isn't any obvious difference in the models, but the different results could possibly be explained due to random variation in training. Especially since you're only training for 10 epochs, the results could be fairly sensitive to the randomly chosen initial weights for the models.
Try running with more epochs (e.g. 1000) and running each one several times (e.g. 5)--the average results should be fairly close.
Upvotes: 0