Reputation: 3
I'm new to TensorFlow and I went through the beginner MNIST tutorial (https://www.tensorflow.org/versions/r0.9/tutorials/mnist/beginners/index.html). It works fine with the given data, but it fails when I try to use my own. Here's an example of an implementation where I'm trying to make the network learn addition, (i, j) -> i + j:
import random
import tensorflow as tf

def generateData(n, r=100):
    # n pairs (i, j) with target i + j, each value drawn from [0, r)
    inputData = []
    outputData = []
    for k in range(n):
        i, j = random.randrange(0, r), random.randrange(0, r)
        inputData.append([i, j])
        outputData.append([i + j])
    return inputData, outputData

x = tf.placeholder(tf.float32, [None, 2])
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 1])

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

for i in range(10):
    batch_xs, batch_ys = generateData(10)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
batch_xs, batch_ys = generateData(10)
print(sess.run(accuracy, feed_dict={x: batch_xs, y_: batch_ys}))

feed_dict = {x: batch_xs}
classification = sess.run(y, feed_dict)
print(classification)
As a result, I get 1.0 for the accuracy and a vector full of [1.0] for the classification. An accuracy of 100% is plausible since the model is really simple, but such predictions clearly are not. In fact, I get exactly the same results if I replace the generated output data i + j with a random number; in that case there is no way the accuracy could be 1.0. It is as if the network did not learn anything. Where is the problem?
Upvotes: 0
Views: 110
Reputation: 1502
You are effectively trying to do linear regression, but you are using the cross-entropy loss, which is geared towards classification*. Try a different loss function, e.g. squared_loss = tf.reduce_mean(tf.squared_difference(y, y_)).
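Here is a minimal sketch of how that loss could be wired into your addition example (my own illustration, not a tested drop-in fix for your exact script): it also drops the softmax so y is a raw linear output, and it uses a much smaller learning rate than 0.5, since with unscaled inputs and targets in the hundreds a large step will likely diverge. Your generateData function is reused unchanged.

import random
import tensorflow as tf

def generateData(n, r=100):
    # same data generator as in the question: (i, j) -> i + j
    inputData, outputData = [], []
    for _ in range(n):
        i, j = random.randrange(0, r), random.randrange(0, r)
        inputData.append([i, j])
        outputData.append([i + j])
    return inputData, outputData

x = tf.placeholder(tf.float32, [None, 2])
y_ = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1]))

y = tf.matmul(x, W) + b                                     # linear output, no softmax
squared_loss = tf.reduce_mean(tf.squared_difference(y, y_))
train_step = tf.train.GradientDescentOptimizer(5e-5).minimize(squared_loss)

sess = tf.Session()
sess.run(tf.initialize_all_variables())
for _ in range(1000):
    batch_xs, batch_ys = generateData(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

test_xs, test_ys = generateData(5)
print(sess.run(y, feed_dict={x: test_xs}))                  # should be close to the true sums
print(test_ys)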
*Also, you are using the cross-entropy loss in the wrong way. If you want to do classification with a cross-entropy loss, you need one output neuron per class of your classification problem. In your code snippet above, you have a single output neuron (y is of shape [None, 1]), but roughly 200 hypothetical classes (the sums 0 through 198), since i and j are each drawn from range(0, 100). Just to be clear, this particular problem should not be treated as a classification problem; I just wanted to point out this error in the code you provided.
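Incidentally, this is also why you see exactly 1.0 everywhere: with a single output column, tf.nn.softmax returns 1.0 for every row, and tf.argmax(..., 1) over a [None, 1] tensor is always 0 for both y and y_, so correct_prediction is always true regardless of the data. A quick standalone check (my own illustration, not from your code):

import tensorflow as tf

sess = tf.Session()
logits = tf.constant([[3.0], [-7.0]])         # any single-column "logits"
targets = tf.constant([[42.0], [13.0]])       # any single-column targets
print(sess.run(tf.nn.softmax(logits)))        # [[1.], [1.]] -- softmax of a single value is always 1
print(sess.run(tf.argmax(logits, 1)))         # [0 0] -- there is only one column to pick
print(sess.run(tf.argmax(targets, 1)))        # [0 0] -- so the equality check is always True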
Upvotes: 1