Reputation: 33
I am trying to set up a neural network with a few layers to solve a simple regression problem, where the target function is f(x) = 0.1x or f(x) = 10x.
All the code is shown below (data generation and the neural network).
The problem is that after running it, the output and the loss turn into NaN values:
And the output layer: [NaN, NaN, NaN, ....., NaN]
I am new to TensorFlow and I am not sure what I might be doing wrong (a badly implemented next-batch selection, the training loop, or the session handling).
import tensorflow as tf
import sys
import numpy
# preparing input data -> X
learningTestData = numpy.arange(1427456).reshape(1394, 1024)
# preparing output data -> f(X) = 0.1X
outputData = numpy.arange(1427456).reshape(1394, 1024)
xx = outputData.shape
dd = 0
while dd < xx[0]:
    jj = 0
    while jj < xx[1]:
        outputData[dd, jj] = outputData[dd, jj] / 10
        jj += 1
    dd += 1
# preparing the NN
x = tf.placeholder(tf.float32, shape=[None, 1024])
y = tf.placeholder(tf.float32, shape=[None, 1024])
full1 = tf.contrib.layers.fully_connected(inputs=x, num_outputs=1024, activation_fn=tf.nn.relu)
full1 = tf.layers.batch_normalization(full1)
full2 = tf.contrib.layers.fully_connected(inputs=full1, num_outputs=5000, activation_fn=tf.nn.relu)
full2 = tf.layers.batch_normalization(full2)
full3 = tf.contrib.layers.fully_connected(inputs=full2, num_outputs=2500, activation_fn=tf.nn.relu)
full3 = tf.layers.batch_normalization(full3)
full4 = tf.contrib.layers.fully_connected(inputs=full3, num_outputs=1024, activation_fn=tf.nn.relu)
full4 = tf.layers.batch_normalization(full4)
out = tf.contrib.layers.fully_connected(inputs=full4, num_outputs=1024, activation_fn=None)
epochs = 20
batch_size = 50
learning_rate = 0.001
batchOffset = 0
# Loss (MSE) and optimizer
cost = tf.losses.mean_squared_error(labels=y, predictions=out)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
with tf.Session() as sess:
    # Initializing the variables
    sess.run(tf.global_variables_initializer())
    e = 0
    while e < epochs:
        # selecting the next batch
        sb = batchOffset
        eb = batchOffset + batch_size
        x_batch = learningTestData[sb:eb, :]
        y_batch = outputData[sb:eb, :]
        # learn
        opt = sess.run(optimizer, feed_dict={x: x_batch, y: y_batch})
        # show the MSE loss
        c = sess.run(cost, feed_dict={x: x_batch, y: y_batch})
        print("epoch: {}, optimizer: {}, loss: {}".format(e, opt, c))
        batchOffset += batch_size
        e += 1
Upvotes: 2
Views: 3707
Reputation: 8585
You need to normalize your data, because your gradients, and as a result the cost, are exploding. Try running this code:
learning_rate = 0.00000001
x_batch = learningTestData[:10]
y_batch = outputData[:10]
with tf.Session() as sess:
    # Initializing the variables
    sess.run(tf.global_variables_initializer())
    opt = sess.run(optimizer, feed_dict={x: x_batch, y: y_batch})
    c = sess.run(cost, feed_dict={x: x_batch, y: y_batch})
    print(c)  # 531492.3
In this case you will get finite values, because the gradients haven't yet taken the cost to infinity. Use normalized data, reduce the learning rate, or reduce the batch size to make it work; normalization could look like the sketch below.
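For example, a minimal sketch of one possible normalization scheme (simple max-scaling; the name x_max is introduced here just for illustration and is not part of the original code), applied to the question's arrays before training:

# scale inputs and targets using the maximum of the training inputs;
# dividing both by the same constant keeps the 0.1x relationship intact
x_max = learningTestData.max()
learningTestData = learningTestData.astype(numpy.float32) / x_max  # now in [0, 1]
outputData = outputData.astype(numpy.float32) / x_max              # now in [0, 0.1]

With inputs on this scale, the original learning_rate = 0.001 should be far less likely to make the gradients diverge.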
Upvotes: 2