Faust

Reputation: 1

Neural network for linear regression using tensorflow

I just started learning TensorFlow and was implementing a neural network for linear regression. I followed some of the online tutorials available and was able to write the code. I am using no activation function, and I am using MSE (tf.reduce_sum(tf.square(output_layer - y))) as the cost. When I run the code I get NaN as the prediction accuracy. The code I used is given below

import numpy as np
import tensorflow as tf

# x_size, seed, train_X, train_y, test_X and test_y are defined elsewhere

# Placeholders
X = tf.placeholder("float", shape=[None, x_size])
y = tf.placeholder("float")

# Single linear layer, no activation
w_1 = tf.Variable(tf.random_normal([x_size, 1], seed=seed))

output_layer = tf.matmul(X, w_1)
predict = output_layer

# Sum of squared errors over the batch
cost = tf.reduce_sum(tf.square(output_layer - y))
optimizer = tf.train.GradientDescentOptimizer(0.0001).minimize(cost)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

for epoch in range(100):
    # Train with each example
    for i in range(len(train_X)):
        sess.run(optimizer, feed_dict={X: train_X[i: i + 1], y: train_y[i: i + 1]})

        train_accuracy = np.mean(sess.run(predict, feed_dict={X: train_X, y: train_y}))
        test_accuracy = np.mean(sess.run(predict, feed_dict={X: test_X, y: test_y}))

        print("Epoch = %d, train accuracy = %.2f%%, test accuracy = %.2f%%"
              % (epoch + 1, 100. * train_accuracy, 100. * test_accuracy))

sess.close()

A sample output is given below

Epoch = 1, train accuracy = -2643642714558682640372224491520000.000000%, test accuracy = -2683751730046365038353121175142400.000000%
Epoch = 1, train accuracy = 161895895004931631079134808611225600.000000%, test accuracy = 165095877160981392686228427295948800.000000%
Epoch = 1, train accuracy = -18669546053716288450687958380235980800.000000%, test accuracy = -19281734142647757560839513130087219200.000000%
Epoch = 1, train accuracy = inf%, test accuracy = inf%
Epoch = 1, train accuracy = nan%, test accuracy = nan%

Any help is appreciated. Also, if you can provide debugging tips, that would be really great.

Thanks.

NOTE: When I run the updates for single examples one at a time, the predicted value becomes too large

sess.run(optimizer, feed_dict={X: train_X[0:1], y: train_y[0:1]})
sess.run(optimizer, feed_dict={X: train_X[1:2], y: train_y[1:2]})
sess.run(optimizer, feed_dict={X: train_X[2:3], y: train_y[2:3]})
print(sess.run(predict, feed_dict={X: train_X[3:4], y: train_y[3:4]}))

Output

[[  1.64660544e+08]]

NOTE: When I reduce the learning_rate to a small value (1e-8), it kind of works. Still, the higher learning_rate worked fine when I was running regression on the same dataset. So was the high learning rate the issue here?
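For reference, one way to watch the divergence as it happens is to fetch the cost alongside the train op and stop once it becomes non-finite. This is a minimal sketch, assuming the same X, y, cost, optimizer, and training data defined above:

# Hypothetical debugging loop: fetch the cost together with the train op
# so the blow-up is visible step by step (uses the ops defined above)
for i in range(len(train_X)):
    _, c = sess.run([optimizer, cost],
                    feed_dict={X: train_X[i: i + 1], y: train_y[i: i + 1]})
    print("step = %d, cost = %e" % (i, c))
    if not np.isfinite(c):
        print("cost diverged at step %d" % i)
        break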

Upvotes: 0

Views: 557

Answers (1)

nessuno

Reputation: 27070

cost = tf.reduce_sum(tf.square(output_layer - y))

at this line, you're computing the sum over all the elements in the batch, where each element is a squared difference.

This is fine if your batch has size 1 (stochastic gradient descent). But since you want to do mini-batch gradient descent (batch size > 1), you want to minimize the average error over the batch instead.

Thus, you want to minimize this function:

cost = tf.reduce_mean(tf.square(output_layer - y))

tf.reduce_mean computes the mean of the elements in its input.

If the batch size is 1, the formula behaves exactly like the one you used before; when the batch size is greater than 1, it computes the mean squared error over the batch, which is what you want.
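To see the difference concretely, here is a minimal sketch with a made-up batch of four squared errors: the sum grows with the batch size (which also scales the gradient, effectively multiplying your learning rate), while the mean stays on the same scale no matter how large the batch is.

import tensorflow as tf

# Made-up batch of squared errors, for illustration only
sq_err = tf.constant([1.0, 2.0, 3.0, 4.0])

with tf.Session() as demo_sess:
    print(demo_sess.run(tf.reduce_sum(sq_err)))   # 10.0 -- scales with batch size
    print(demo_sess.run(tf.reduce_mean(sq_err)))  # 2.5  -- independent of batch size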

Upvotes: 1
