Reputation: 10992
I have had a really weird time with TensorFlow over the last few days and can't figure out what's going wrong at the moment.
I have built this network: link. It is a copy of TensorFlow's MNIST example.
Basically, what I did is alter the network so that instead of taking 28x28x1 images (MNIST greyscale) it takes 600x800x1 images (some images I took on my own: a webcam shot with a relatively clean background and one object at different locations).
What I wanted to do is play around with such a CNN and have it output the x-location of the object in the image, i.e. a single output neuron.
However, no matter what I tried, the network always outputs 1.0. Even (see my testing section at the end of the code) when I feed all ones, all zeros or random numbers into the network.
Of course, since I have only 21 labeled training images and 7 labeled test images, I expected the performance to be really bad (800x600 pixel images are huge for neural networks, and locating an object isn't easy).
...but I have no idea at all why the network always outputs 1.0 even if it is fed with nonsense. Any ideas?
Upvotes: 5
Views: 4541
Reputation: 126154
Looking at your source code, it appears that your final fully connected layer before the softmax (L.104 in the code you shared) reduces each example down to a single output class before computing the softmax. Since there is only one class for each example, the result of the tf.nn.softmax() op (y_conv) will be a batch_size x 1 matrix containing 1.0 in every element, and the tf.argmax() of that will contain 0 for every element, since there is only one value. Similarly, applying tf.argmax() to y_train (which is a batch_size x 1 matrix) will yield 0 for every element, so the "accuracy" will be 100%.
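You can verify this property of softmax directly. The sketch below uses numpy rather than TensorFlow, but the math is identical: softmax over a single-element row is always exactly 1.0, and argmax over a single-element row is always index 0.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# A batch of 4 examples, each reduced to a single logit
# (a batch_size x 1 matrix, mirroring the layer described above).
logits = np.array([[2.3], [-1.7], [0.0], [5.1]])

y_conv = softmax(logits, axis=1)
print(y_conv.ravel())             # every element is 1.0, regardless of the logit
print(np.argmax(y_conv, axis=1))  # every element is 0: there is only one column
```

This is why the network appears to output 1.0 for any input, including all-zero or random images: the constant output is a property of applying softmax to a single class, not of the learned weights.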
To fix this, you should probably (i) increase the number of output units in the final fully connected layer to the number of classes, and (ii) encode each row of y_train as a one-hot vector representing the true class of the corresponding example.
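One way to apply step (ii) to the x-location task is to discretize the x-coordinate into bins and treat each bin as a class. The sketch below is a hypothetical illustration in numpy; `num_classes` and the binning scheme are assumptions, not part of the original code.

```python
import numpy as np

num_classes = 10    # assumed number of horizontal bins (hypothetical choice)
image_width = 800   # image width from the question

# Example ground-truth pixel x-coordinates for a batch of 3 images.
x_locations = np.array([120, 430, 795])

# Map each x-coordinate to a bin index in [0, num_classes - 1].
bins = np.minimum(x_locations * num_classes // image_width, num_classes - 1)

# One-hot encode: rows of the identity matrix selected by bin index,
# giving a batch_size x num_classes label matrix.
y_train = np.eye(num_classes)[bins]
print(bins)           # → [1 5 9]
print(y_train.shape)  # → (3, 10)
```

With labels in this shape, the final fully connected layer should output `num_classes` logits per example, and tf.argmax over the softmax output becomes a meaningful class prediction. (Alternatively, if you want a continuous x-location, drop the softmax entirely and train the single output neuron with a regression loss such as mean squared error.)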
Upvotes: 6