Reputation: 660
When I use TensorFlow to compute a simple linear regression, w, b, and the loss all come out as [nan].
Here is my code:
import tensorflow as tf
w = tf.Variable(tf.zeros([1]), tf.float32)
b = tf.Variable(tf.zeros([1]), tf.float32)
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
liner = w*x+b
loss = tf.reduce_sum(tf.square(liner-y))
train = tf.train.GradientDescentOptimizer(1).minimize(loss)
sess = tf.Session()
x_data = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]
y_data = [265000, 324000, 340000, 412000, 436000, 490000, 574000, 585000, 680000]
sess.run(tf.global_variables_initializer())
for i in range(1000):
    sess.run(train, {x: x_data, y: y_data})

nw, nb, nloss = sess.run([w, b, loss], {x: x_data, y: y_data})
print(nw, nb, nloss)
Output:
[ nan] [ nan] nan
Process finished with exit code 0
Why does this happen, and how can I fix it?
Upvotes: 0
Views: 954
Reputation: 6220
I believe this gives the explanation:

for i in range(10):
    print(sess.run([train, w, b, loss], {x: x_data, y: y_data}))
Gives the following result:
[None, array([ 4.70380012e+10], dtype=float32), array([ 8212000.], dtype=float32), 2.0248419e+12]
[None, array([ -2.68116614e+19], dtype=float32), array([ -4.23342041e+15], dtype=float32), 6.3058345e+29]
[None, array([ 1.52826476e+28], dtype=float32), array([ 2.41304958e+24], dtype=float32), inf]
[None, array([ -8.71110858e+36], dtype=float32), array([ -1.37543819e+33], dtype=float32), inf]
[None, array([ inf], dtype=float32), array([ inf], dtype=float32), inf]
Your learning rate is simply too big, so you "overcorrect" the value of w at each iteration (notice how it oscillates between negative and positive, with ever-increasing absolute value). The values grow until something reaches infinity, which produces the NaN values. Just lower the learning rate, by a lot.
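To see the mechanism without TensorFlow at all, here is a plain-NumPy sketch of the same gradient-descent dynamics. It is not the asker's graph: as assumptions, x is scaled down by 1000 and the loss is a mean (rather than a sum) of squares so the numbers stay readable, but the overshoot-and-flip behavior at a large learning rate is the same.

```python
import numpy as np

# Same data as the question, with x divided by 1000 (assumption, for readability).
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=np.float64)
y = np.array([265, 324, 340, 412, 436, 490, 574, 585, 680], dtype=np.float64)

def fit(lr, steps):
    """Plain gradient descent on mean squared error for y ~ w*x + b."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        w -= lr * 2 * np.mean(err * x)   # d(MSE)/dw
        b -= lr * 2 * np.mean(err)       # d(MSE)/db
    return w, b

# lr=1, as in the question: each step overshoots the minimum and flips sign,
# growing until the floats overflow to inf and then become nan.
with np.errstate(over='ignore', invalid='ignore'):
    w_big, _ = fit(lr=1.0, steps=300)

# A small learning rate converges to the least-squares fit instead.
w_small, b_small = fit(lr=0.001, steps=50000)

print(w_big)              # inf or nan: diverged
print(w_small, b_small)   # roughly 49.8 and 207: the least-squares solution
```

The stability threshold depends on the scale of x (roughly, the learning rate must be below 2 divided by the largest curvature of the loss), which is why the original code with x in the thousands and a summed loss diverges even faster.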
Upvotes: 1
Reputation: 350
You are overflowing because your learning rate is so high (1 in your case). Try a learning rate of 0.001. Also, divide your data by 1000 and increase the number of iterations, and it should work. This is the code I tested, and it works perfectly.
import tensorflow as tf
import matplotlib.pyplot as plt

x_data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_data = [265, 324, 340, 412, 436, 490, 574, 585, 680]

plt.plot(x_data, y_data, 'ro', label='Original data')
plt.legend()
plt.show()

W = tf.Variable(tf.random_uniform([1], 0, 1))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.001)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()  # initialize_all_variables is deprecated
sess = tf.Session()
sess.run(init)

for step in range(50000):
    sess.run(train)

print(step, sess.run(loss))
print(step, sess.run(W), sess.run(b))

plt.plot(x_data, y_data, 'ro', label='Original data')
plt.plot(x_data, sess.run(W) * x_data + sess.run(b), label='Fitted line')
plt.legend()
plt.show()
Upvotes: 1