Reputation: 1476
I'm learning TensorFlow and trying to apply it on a simple linear regression problem. data
is numpy.ndarray of shape [42x2].
I'm a bit puzzled why after each succesive epoch the loss is increasing. Isn't the loss expected to to go down with each successive epoch!
Here is my code (let me know, if you'd like me to share the output as well!): (Thanks a lot for taking your time to answer to it.)
1) created the placeholders for dependent / independent variables
X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32,name='Y')
2) created vars for weight, bias, total_loss (after each epoch)
w = tf.Variable(0.0,name='weights')
b = tf.Variable(0.0,name='bias')
3) defined loss function & optimizer
Y_pred = X * w + b
loss = tf.reduce_sum(tf.square(Y - Y_pred), name = 'loss')
optimizer = tf.train.GradientDescentOptimizer(learning_rate = 0.001).minimize(loss)
4) created summary events & event file writer
tf.summary.scalar(name = 'weight', tensor = w)
tf.summary.scalar(name = 'bias', tensor = b)
tf.summary.scalar(name = 'loss', tensor = loss)
merged = tf.summary.merge_all()
evt_file = tf.summary.FileWriter('def_g')
evt_file.add_graph(tf.get_default_graph())
5) and execute all in a session
with tf.Session() as sess1:
sess1.run(tf.variables_initializer(tf.global_variables()))
for epoch in range(10):
summary, _,l = sess1.run([merged,optimizer,loss],feed_dict={X:data[:,0],Y:data[:,1]})
evt_file.add_summary(summary,epoch+1)
evt_file.flush()
print(" new_loss: {}".format(sess1.run(loss,feed_dict={X:data[:,0],Y:data[:,1]})))
Cheers!
Upvotes: 2
Views: 1725
Reputation: 824
The short answer is that your learning rate is too big. I was able to get reasonable results by changing it from 0.001 to 0.0001, but I only used the 23 points from your second-last comment (I initially didn't notice your last comment), so using all the data might require an even lower number.
0.001 seems like a really low learning rate. However, the real problem is that your loss function is using reduce_sum
instead of reduce_mean
. This causes your loss to be a large number, which sends a very strong signal to the GradientDescentOptimizer, so it's overshooting despite the low learning rate. The problem would only get worse if you added more points to your training data. So use reduce_mean
to get the average squared error and your algorithms will be much better behaved.
Upvotes: 10