Reputation: 660
When I use TensorFlow to compute a simple linear regression, w, b, and the loss all come out as [nan].
Here is my code:
import tensorflow as tf
w = tf.Variable(tf.zeros([1]), tf.float32)
b = tf.Variable(tf.zeros([1]), tf.float32)
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
liner = w*x+b
loss = tf.reduce_sum(tf.square(liner-y))
train = tf.train.GradientDescentOptimizer(1).minimize(loss)
sess = tf.Session()
x_data = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]
y_data = [265000, 324000, 340000, 412000, 436000, 490000, 574000, 585000, 680000]
sess.run(tf.global_variables_initializer())
for i in range(1000):
    sess.run(train, {x: x_data, y: y_data})

nw, nb, nloss = sess.run([w, b, loss], {x: x_data, y: y_data})
print(nw, nb, nloss)
Output:
[ nan] [ nan] nan
Process finished with exit code 0
Why does this happen, and how can I fix it?
Upvotes: 0
Views: 954
Reputation: 6220
I believe this gives the explanation:

for i in range(10):
    print(sess.run([train, w, b, loss], {x: x_data, y: y_data}))
Gives the following result:
[None, array([ 4.70380012e+10], dtype=float32), array([ 8212000.], dtype=float32), 2.0248419e+12]
[None, array([ -2.68116614e+19], dtype=float32), array([ -4.23342041e+15], dtype=float32), 6.3058345e+29]
[None, array([ 1.52826476e+28], dtype=float32), array([ 2.41304958e+24], dtype=float32), inf]
[None, array([ -8.71110858e+36], dtype=float32), array([ -1.37543819e+33], dtype=float32), inf]
[None, array([ inf], dtype=float32), array([ inf], dtype=float32), inf]
Your learning rate is simply too big, so you "overcorrect" the value of w at each iteration (notice how it oscillates between negative and positive, with ever-increasing absolute value). The values grow until something reaches infinity, which produces the NaN values. Just lower the learning rate, by a lot.
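To see the mechanism without TensorFlow at all, here is a plain-NumPy sketch of the same gradient-descent dynamics. It is not the asker's graph: as assumptions, x is scaled down by 1000 and the loss is a mean (rather than a sum) of squares so the numbers stay readable, but the overshoot-and-flip behavior at a large learning rate is the same.

```python
import numpy as np

# Same data as the question, with x divided by 1000 (assumption, for readability).
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=np.float64)
y = np.array([265, 324, 340, 412, 436, 490, 574, 585, 680], dtype=np.float64)

def fit(lr, steps):
    """Plain gradient descent on mean squared error for y ~ w*x + b."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y
        w -= lr * 2 * np.mean(err * x)   # d(MSE)/dw
        b -= lr * 2 * np.mean(err)       # d(MSE)/db
    return w, b

# lr=1, as in the question: each step overshoots the minimum and flips sign,
# growing until the floats overflow to inf and then become nan.
with np.errstate(over='ignore', invalid='ignore'):
    w_big, _ = fit(lr=1.0, steps=300)

# A small learning rate converges to the least-squares fit instead.
w_small, b_small = fit(lr=0.001, steps=50000)

print(w_big)              # inf or nan: diverged
print(w_small, b_small)   # roughly 49.8 and 207: the least-squares solution
```

The stability threshold depends on the scale of x (roughly, the learning rate must be below 2 divided by the largest curvature of the loss), which is why the original code with x in the thousands and a summed loss diverges even faster.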
Upvotes: 1
Reputation: 350
You are overflowing because your learning rate is so high (1 in your case). Try a learning rate of 0.001. Also, divide your data by 1000 and increase the number of iterations, and it should work. This is the code I tested, and it works perfectly.
import tensorflow as tf
import matplotlib.pyplot as plt

x_data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_data = [265, 324, 340, 412, 436, 490, 574, 585, 680]

plt.plot(x_data, y_data, 'ro', label='Original data')
plt.legend()
plt.show()

W = tf.Variable(tf.random_uniform([1], 0, 1))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.001)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()  # initialize_all_variables is deprecated
sess = tf.Session()
sess.run(init)

for step in range(50000):
    sess.run(train)

print(step, sess.run(loss))
print(step, sess.run(W), sess.run(b))

plt.plot(x_data, y_data, 'ro', label='Original data')
plt.plot(x_data, sess.run(W) * x_data + sess.run(b), label='Fitted line')
plt.legend()
plt.show()
Upvotes: 1