AtariMae

Reputation: 3

Failing to use custom data in beginner tutorial

I'm new to TensorFlow and I went through the beginner tutorial about MNIST (https://www.tensorflow.org/versions/r0.9/tutorials/mnist/beginners/index.html). It works fine with the given data, but it fails when I try to use my own. Here's an example implementation where I'm trying to make the network learn addition, (i, j) -> i + j:

import random
import tensorflow as tf

def generateData(n, r=100):
    # Generate n random pairs (i, j) with target i + j.
    inputData = []
    outputData = []
    for k in range(n):
        i, j = random.randrange(0, r), random.randrange(0, r)
        inputData.append([i, j])
        outputData.append([i + j])
    return inputData, outputData

# Model from the MNIST tutorial: a single softmax layer.
x = tf.placeholder(tf.float32, [None, 2])
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 1])

# Cross-entropy loss and gradient descent, as in the tutorial.
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

# Train on 10 batches of 10 generated examples.
for i in range(10):
    batch_xs, batch_ys = generateData(10)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Evaluate accuracy on fresh data, then print the raw predictions.
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
batch_xs, batch_ys = generateData(10)
print(sess.run(accuracy, feed_dict={x: batch_xs, y_: batch_ys}))

feed_dict = {x: batch_xs}
classification = sess.run(y, feed_dict)
print(classification)

As a result, I get an accuracy of 1.0, and every predicted value in classification is [1.0]. An accuracy of 100% would be plausible, since the model is really simple, but the predictions are clearly wrong. In fact, I get the exact same results if I replace the generated outputs i + j with random numbers; in that case there is no way I could get an accuracy of 1.0. It is as if the network did not learn anything. Where is the problem?

Upvotes: 0

Views: 110

Answers (1)

lballes

Reputation: 1502

You are effectively trying to do linear regression, but you are using the cross-entropy loss, which is geared towards classification*. Try a regression loss instead, e.g. squared_loss = tf.reduce_mean(tf.square(y - y_)); see the sketch below.
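
For concreteness, here is a minimal, untested sketch of the regression version. It keeps your generateData and the TF 0.9-era API from the question; the learning rate of 1e-5 is a hand-picked assumption, because with inputs up to ~100 a rate like 0.5 makes plain gradient descent blow up:

import random
import tensorflow as tf

def generateData(n, r=100):
    # Same generator as in the question: pairs (i, j) with target i + j.
    inputData, outputData = [], []
    for _ in range(n):
        i, j = random.randrange(0, r), random.randrange(0, r)
        inputData.append([i, j])
        outputData.append([i + j])
    return inputData, outputData

x = tf.placeholder(tf.float32, [None, 2])
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1]))
y = tf.matmul(x, W) + b  # linear output: no softmax for regression
y_ = tf.placeholder(tf.float32, [None, 1])

# Squared loss instead of cross entropy.
squared_loss = tf.reduce_mean(tf.square(y - y_))
train_step = tf.train.GradientDescentOptimizer(1e-5).minimize(squared_loss)

sess = tf.Session()
sess.run(tf.initialize_all_variables())
for step in range(1000):
    batch_xs, batch_ys = generateData(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# After training, W should be close to [[1], [1]] and b close to [0].
test_xs, test_ys = generateData(5)
print(sess.run(y, feed_dict={x: test_xs}))
print(test_ys)

The optimum for this loss is exactly W = [[1], [1]], b = [0], so the predictions should land close to the true sums.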

*Also, you are using the cross-entropy loss incorrectly. If you want to do classification with a cross-entropy loss, you need one output neuron per class of your classification problem. In your code snippet above, you have a single output neuron (y has shape [None, 1]), but 199 hypothetical classes (sums from 0 to 198), since i and j are each drawn from 0 to 99. Incidentally, that single output neuron is also why your reported accuracy is always 1.0: tf.argmax along dimension 1 of a [None, 1] tensor is always 0 for both y and y_, so correct_prediction is always true, even with random targets (see the quick check below). Just to be clear, this particular problem should not be treated as a classification problem; I just wanted to point out this error in the code you provided.
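
Here is a quick check of that argmax behavior:

import tensorflow as tf

# Two arbitrary single-column "predictions" and "labels".
y = tf.constant([[0.3], [0.9]])
y_ = tf.constant([[5.0], [7.0]])

# With a single column, argmax along dimension 1 is always 0,
# so predictions and labels "match" no matter what they contain.
match = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))

with tf.Session() as sess:
    print(sess.run(match))  # [ True  True]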

Upvotes: 1
