yoshi

TensorFlow model gets loss 0

import tensorflow as tf
import numpy as np
def weight(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias(shape):
    return tf.Variable(tf.constant(0.1, shape=shape))

def output(input, w, b):
    return tf.matmul(input, w) + b
x_columns = 33
y_columns = 1
layer1_num = 7
layer2_num = 7
epoch_num = 10
train_num = 1000
batch_size = 100
display_size = 1
x = tf.placeholder(tf.float32,[None,x_columns])
y = tf.placeholder(tf.float32,[None,y_columns])

layer1 = tf.nn.relu(output(x, weight([x_columns, layer1_num]), bias([layer1_num])))
layer2 = tf.nn.relu(output(layer1, weight([layer1_num, layer2_num]), bias([layer2_num])))
prediction = output(layer2, weight([layer2_num, y_columns]), bias([y_columns]))

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=prediction))
train_step = tf.train.AdamOptimizer().minimize(loss)

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for epoch in range(epoch_num):
   avg_loss = 0.
   for i in range(train_num):
      # x_train and y_train are assumed to be defined earlier (not shown here)
      index = np.random.choice(len(x_train), batch_size)
      x_train_batch = x_train[index]
      y_train_batch = y_train[index]
      _, c = sess.run([train_step, loss],
                      feed_dict={x: x_train_batch, y: y_train_batch})
      avg_loss += c/train_num
   if epoch % display_size == 0:
      print("Epoch:{0},Loss:{1}".format(epoch+1,avg_loss))
print("Training Finished")

My model gets:

Epoch:2,Loss:0.0
Epoch:3,Loss:0.0
Epoch:4,Loss:0.0
Epoch:5,Loss:0.0
Epoch:6,Loss:0.0
Epoch:7,Loss:0.0
Epoch:8,Loss:0.0
Epoch:9,Loss:0.0
Epoch:10,Loss:0.0
Training Finished

How can I deal with this problem?

Answers (1)

gdelab

softmax_cross_entropy_with_logits expects labels in one-hot form, i.e. with shape [batch_size, num_classes]. Here you have y_columns = 1, which means there is only one class, and that class is necessarily always both the predicted one and the 'ground truth' (from your network's point of view), so your output is always correct no matter what the weights are. Hence loss = 0.
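
You can check this in isolation: whatever value the single logit takes, softmax normalizes it to probability 1.0, so the cross-entropy is always 0. A minimal standalone demonstration (TF 1.x, like your code):

import tensorflow as tf

# With a single "class", softmax turns the lone logit into probability 1.0,
# so the cross-entropy against the only possible one-hot label is 0,
# no matter what the logit value is.
logits = tf.constant([[3.7], [-120.0]])  # arbitrary values, shape [batch, 1]
labels = tf.constant([[1.0], [1.0]])     # the only possible one-hot label
loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(loss))  # prints [0. 0.]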

I guess you do have several different classes, and y_train contains the ID of the label. In that case prediction should have shape [batch_size, num_classes], and instead of softmax_cross_entropy_with_logits you should use tf.nn.sparse_softmax_cross_entropy_with_logits.
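
A minimal sketch of that change, reusing your output / weight / bias helpers and assuming (since the post doesn't show the data) num_classes distinct labels with y_train holding integer class IDs:

num_classes = 3  # hypothetical; replace with your real number of classes

# Labels become plain integer class IDs of shape [batch_size], not one-hot rows
y = tf.placeholder(tf.int64, [None])

# The output layer now produces one logit per class
prediction = output(layer2, weight([layer2_num, num_classes]), bias([num_classes]))

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=prediction))

y_train then has to be a 1-D array of integers in [0, num_classes), rather than a single float column.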
