Neural Network after first epoch generates NaN values as output, loss

Question

I am trying to set up a neural network with a few layers to solve a simple regression problem, which should be f(x) = 0.1x or f(x) = 10x.

All the code is shown below (data generation and the neural network).

4 fully connected layers with ReLU
loss function: MSE (tf.losses.mean_squared_error)
optimizer: gradient descent

The problem is that when I run it, the output and the loss turn into NaN:

epoch: 0, optimizer: None, loss: inf
epoch: 1, optimizer: None, loss: nan

And the output layer: [NaN, NaN, NaN, ..... , NaN]

I am new to TensorFlow and I am not sure what I might be doing wrong (a bad implementation of the next-batch selection, the learning step, or the session handling).

import tensorflow as tf
import sys
import numpy

# preparing input data -> X
learningTestData = numpy.arange(1427456, dtype=numpy.float32).reshape(1394, 1024)

# preparing output data -> f(X) = 0.1X
# float dtype, so the division keeps the fractional part of every target
outputData = learningTestData / 10

#preparing the NN
x = tf.placeholder(tf.float32, shape=[None, 1024])
y = tf.placeholder(tf.float32, shape=[None, 1024])

full1 = tf.contrib.layers.fully_connected(inputs=x, num_outputs=1024, activation_fn=tf.nn.relu)
full1 = tf.layers.batch_normalization(full1)

full2 = tf.contrib.layers.fully_connected(inputs=full1, num_outputs=5000, activation_fn=tf.nn.relu)
full2 = tf.layers.batch_normalization(full2)

full3 = tf.contrib.layers.fully_connected(inputs=full2, num_outputs=2500, activation_fn=tf.nn.relu)
full3 = tf.layers.batch_normalization(full3)

full4 = tf.contrib.layers.fully_connected(inputs=full3, num_outputs=1024, activation_fn=tf.nn.relu)
full4 = tf.layers.batch_normalization(full4)


out = tf.contrib.layers.fully_connected(inputs=full4, num_outputs=1024, activation_fn=None)


epochs = 20
batch_size = 50
learning_rate = 0.001
batchOffset = 0

# Loss (MSE) and optimizer
cost = tf.losses.mean_squared_error(labels=y, predictions=out)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)


with tf.Session() as sess:
    # Initializing the variables
    sess.run(tf.global_variables_initializer())

    e = 0

    while e < epochs:

        #selecting next batch
        sb = batchOffset
        eb = batchOffset+batch_size
        x_batch = learningTestData[sb:eb, :]
        y_batch = outputData[sb:eb, :]

        #learn
        opt = sess.run(optimizer,feed_dict={x: x_batch, y: y_batch})
        # show the loss (MSE)
        c = sess.run(cost, feed_dict={x: x_batch, y: y_batch})
        print("epoch: {}, optimizer: {}, loss: {}".format(e, opt, c))

        batchOffset += batch_size
        e += 1

Vlad · Accepted Answer

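You need to normalize your data, because your gradients, and as a result the cost, are exploding. The inputs range up to 1,427,455, so the squared error and its gradients are enormous from the very first step. Try running this code: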

# set this before the optimizer line from the question is executed
learning_rate = 0.00000001
x_batch = learningTestData[:10]
y_batch = outputData[:10]
with tf.Session() as sess:
    # Initializing the variables
    sess.run(tf.global_variables_initializer())
    opt = sess.run(optimizer,feed_dict={x: x_batch, y: y_batch})

    c = sess.run(cost, feed_dict={x: x_batch, y: y_batch})
    print(c) # 531492.3

In this case you will get finite values, because the gradients have not driven the cost to infinity. Use normalized data, reduce the learning rate, or reduce the batch size to make it work.
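As a rough illustration of the normalization advice, here is a minimal NumPy-only sketch. It is not the asker's network: a single-weight linear model (y = w * x) stands in for it, and the names X_scaled, Y_scaled, to_original_units and gd_losses are made up for this example. It min-max scales the inputs and targets into [0, 1] and compares a few plain gradient-descent steps with the question's learning rate of 0.001 on raw and on scaled data.

```python
import numpy as np

# Same data layout as in the question: inputs 0..1427455, targets 0.1 * input.
# float64 is used here only to keep the toy arithmetic finite and easy to read.
X = np.arange(1427456, dtype=np.float64).reshape(1394, 1024)
Y = 0.1 * X

# Min-max scale inputs and targets into [0, 1] using training-set statistics.
x_min, x_max = X.min(), X.max()
y_min, y_max = Y.min(), Y.max()
X_scaled = (X - x_min) / (x_max - x_min)
Y_scaled = (Y - y_min) / (y_max - y_min)


def to_original_units(pred_scaled):
    """Map predictions made in scaled space back to the original target range."""
    return pred_scaled * (y_max - y_min) + y_min


def gd_losses(x, y, learning_rate=0.001, steps=5):
    """Fit y = w * x by gradient descent; return the MSE measured before each update."""
    w = 0.0
    losses = []
    for _ in range(steps):
        residual = w * x - y
        losses.append(float(np.mean(residual ** 2)))
        grad = 2.0 * np.mean(residual * x)  # d(MSE)/dw
        w -= learning_rate * grad
    return losses


raw_losses = gd_losses(X, Y)                    # grows by many orders of magnitude per step
scaled_losses = gd_losses(X_scaled, Y_scaled)   # shrinks slowly but steadily

print("raw:   ", raw_losses)
print("scaled:", scaled_losses)
```

With the real network, the same idea would mean feeding X_scaled and Y_scaled through feed_dict instead of the raw arrays, and passing the network output through to_original_units when predictions are needed in the original range.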
