Schnurrberto

Reputation: 41

Evaluation of Regression Neural Network

Hi,

I am trying to write a small program to solve a regression problem. Each row of my dataset consists of 4 random x values (x1, x2, x3 and x4) and 1 y value. One of the rows looks like this:

0.634585    0.552366    0.873447    0.196890    8.75
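
For reference, rows in that format could be loaded and split into the X/y arrays used below along these lines (the file name is hypothetical; slicing the y column with 4: keeps it two-dimensional, matching the [None, 1] placeholder):

import numpy as np

data = np.loadtxt('dataset.txt')    # hypothetical file of whitespace-separated rows
X = data[:, :4].astype(np.float32)  # x1..x4, shape (n, 4)
y = data[:, 4:].astype(np.float32)  # y, kept 2-D: shape (n, 1)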

I now want to predict the y-value as closely as possible, so after training I would like to evaluate how good my model is by printing the loss. Unfortunately, I always receive

Training cost= nan

The most important lines of code are:

X_data = tf.placeholder(shape=[None, 4], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# Input neurons : 4
# Hidden neurons : 8
# Output neurons : 1
hidden_layer_nodes = 8

w1 = tf.Variable(tf.random_normal(shape=[4,hidden_layer_nodes])) # Inputs -> Hidden layer
b1 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes]))   # First Bias
w2 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes,1])) # Hidden layer -> Output
b2 = tf.Variable(tf.random_normal(shape=[1]))   # Second Bias

hidden_output = tf.nn.relu(tf.add(tf.matmul(X_data, w1), b1))
final_output = tf.nn.relu(tf.add(tf.matmul(hidden_output, w2), b2))

loss = tf.reduce_mean(-tf.reduce_sum(y_target * tf.log(final_output), axis=0))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()

steps = 10000

with tf.Session() as sess:

    sess.run(init)

    for i in range(steps):

        sess.run(train,feed_dict={X_data:X_train,y_target:y_train})

        # PRINT OUT A MESSAGE EVERY 500 STEPS
        if i%500 == 0:

            print('Currently on step {}'.format(i))

    # Note: despite its name, this evaluates the loss on the test set
    training_cost = sess.run(loss, feed_dict={X_data:X_test,y_target:y_test})
    print("Training cost=", training_cost)

Maybe someone knows where my mistake is, or even better, how to continuously show the error during training :) I know how this is done with tf.estimator, but not without it. If you need the dataset, let me know.

Cheers!

Upvotes: 1

Views: 269

Answers (1)

Nipun Wijerathne

Reputation: 1829

This is because the ReLU activation function can cause exploding gradients. Therefore, you need to reduce the learning rate accordingly. You can also try a different activation function (for this you may have to normalize your dataset first).

Here is a similar problem to yours: (In simple multi-layer FFNN only ReLU activation function doesn't converge). Follow the answer there and you will understand.
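
To make those suggestions concrete, here is a minimal sketch of the question's network with a tanh hidden layer instead of ReLU, a linear output, and mean squared error swapped in for the log-based loss (note that tf.log(final_output) is -inf whenever the ReLU output is exactly zero, which produces NaN on its own). It also fetches the loss together with the train op, which shows the error during training as the question asks. The random data is a hypothetical stand-in for the real dataset:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x, as in the question

# Hypothetical stand-in data with the question's shapes: 4 inputs, 1 target
X_train = np.random.rand(100, 4).astype(np.float32)
y_train = (np.random.rand(100, 1) * 10.0).astype(np.float32)

X_data = tf.placeholder(shape=[None, 4], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

hidden_layer_nodes = 8
w1 = tf.Variable(tf.random_normal(shape=[4, hidden_layer_nodes]))
b1 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes]))
w2 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes, 1]))
b2 = tf.Variable(tf.random_normal(shape=[1]))

# tanh hidden layer, linear output, MSE loss -- a standard regression setup
hidden_output = tf.nn.tanh(tf.matmul(X_data, w1) + b1)
final_output = tf.matmul(hidden_output, w2) + b2
loss = tf.reduce_mean(tf.square(final_output - y_target))

train = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10000):
        # Fetch the loss together with the train op to monitor training
        _, current_loss = sess.run([train, loss],
                                   feed_dict={X_data: X_train, y_target: y_train})
        if i % 500 == 0:
            print('Step {}: loss = {:.4f}'.format(i, current_loss))

If you prefer to keep ReLU in the hidden layer, lowering the learning rate and normalizing the inputs and targets, as suggested above, is the usual remedy.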

Hope this helps.

Upvotes: 1
