Kendall Weihe
Kendall Weihe

Reputation: 2075

Tensorflow accuracy at .99 but predictions awful

Maybe I'm making predictions wrong?

Here's the project... I have a greyscale input image that I am trying to segment. The segmentation is a simple binary classification (think of foreground vs background). So the ground truth (y) is a matrix of 0's and 1's -- so there's 2 classifications. Oh and the input image is a square, so I just use one variable called n_input

My accuracy essentially converges to 0.99 but when I make a prediction I get all zero's. EDIT --> there is a single 1 in each output matrices, both in the same place...

Here's my session code(everything else is working)...

with tf.Session() as sess:
    sess.run(init)
    summary = tf.train.SummaryWriter('/tmp/logdir/', sess.graph_def)
    step = 1
    from tensorflow.contrib.learn.python.learn.datasets.scroll import scroll_data
    data = scroll_data.read_data('/home/kendall/Desktop/')
    # Keep training until reach max iterations
    flag = 0
    # while flag == 0:
    while step * batch_size < training_iters:
        batch_y, batch_x = data.train.next_batch(batch_size)
        # pdb.set_trace()
        # batch_x = batch_x.reshape((batch_size, n_input))
        batch_x = batch_x.reshape((batch_size, n_input, n_input))
        batch_y = batch_y.reshape((batch_size, n_input, n_input))
        batch_y = convert_to_2_channel(batch_y, batch_size)
        # batch_y = batch_y.reshape((batch_size, n_output, n_classes))
        batch_y = batch_y.reshape((batch_size, 200, 200, n_classes))
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                       keep_prob: dropout})
        if step % display_step == 0:
            flag = 1
            # Calculate batch loss and accuracy
            loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                              y: batch_y,
                                                              keep_prob: 1.})
            print "Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"
    save_path = "model.ckpt"
    saver.save(sess, save_path)

    im = Image.open('/home/kendall/Desktop/HA900_frames/frame0635.tif')
    batch_x = np.array(im)
    pdb.set_trace()
    batch_x = batch_x.reshape((1, n_input, n_input))
    batch_x = batch_x.astype(float)
    # pdb.set_trace()
    prediction = sess.run(pred, feed_dict={x: batch_x, keep_prob: 1.})
    print prediction
    arr1 = np.empty((n_input,n_input))
    arr2 = np.empty((n_input,n_input))
    for i in xrange(n_input):
        for j in xrange(n_input):
            for k in xrange(2):
                if k == 0:
                    arr1[i][j] = prediction[0][i][j][k]
                else:
                    arr2[i][j] = prediction[0][i][j][k]
    # prediction = np.asarray(prediction)
    # prediction = np.reshape(prediction, (200,200))
    # np.savetxt("prediction.csv", prediction, delimiter=",")
    np.savetxt("prediction1.csv", arr1, delimiter=",")
    np.savetxt("prediction2.csv", arr2, delimiter=",")

Since there are two classifications, that end part (with the couple of loops) is just to partition the prediction into two 2x2 matrices.

I saved the prediction arrays to a CSV file, and like I said, they were all zeros.

I have also confirmed all data is correct (dimensions and values).

Why would the training converge, but predictions are awful?

If you want to look at all the code, here it is...

import tensorflow as tf
import pdb
import numpy as np
from numpy import genfromtxt
from PIL import Image

# Import MINST data
# from tensorflow.examples.tutorials.mnist import input_data
# mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)


# Parameters
learning_rate = 0.001
training_iters = 20000
batch_size = 128
display_step = 1

# Network Parameters
n_input = 200 # MNIST data input (img shape: 28*28)
n_output = 40000 # MNIST total classes (0-9 digits)
n_classes = 2
#n_input = 200

dropout = 0.75 # Dropout, probability to keep units

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input, n_input])
y = tf.placeholder(tf.float32, [None, n_input, n_input, n_classes])
keep_prob = tf.placeholder(tf.float32) #dropout (keep probability)

# Create some wrappers for simplicity
def conv2d(x, W, b, strides=1):
    # Conv2D wrapper, with bias and relu activation
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)

def maxpool2d(x, k=2):
    # MaxPool2D wrapper
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1],
                          padding='SAME')


# Create model
def conv_net(x, weights, biases, dropout):
    # Reshape input picture
    x = tf.reshape(x, shape=[-1, n_input, n_input, 1])

    # Convolution Layer
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    # Max Pooling (down-sampling)
    conv1 = maxpool2d(conv1, k=2)
    conv1 = tf.nn.local_response_normalization(conv1)

    # Convolution Layer
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    # Max Pooling (down-sampling)
    conv2 = tf.nn.local_response_normalization(conv2)
    conv2 = maxpool2d(conv2, k=2)

    # Convolution Layer
    conv3 = conv2d(conv2, weights['wc3'], biases['bc3'])
    # Max Pooling (down-sampling)
    conv3 = tf.nn.local_response_normalization(conv3)
    conv3 = maxpool2d(conv3, k=2)

    # pdb.set_trace()

    # Fully connected layer
    # Reshape conv2 output to fit fully connected layer input
    fc1 = tf.reshape(conv3, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    # Apply Dropout
    fc1 = tf.nn.dropout(fc1, dropout)

    output = []
    for i in xrange(2):
        output.append(tf.nn.softmax(tf.add(tf.matmul(fc1, weights['out']), biases['out'])))

    return output
    # return tf.nn.softmax(tf.add(tf.matmul(fc1, weights['out']), biases['out']))


# Store layers weight & bias
weights = {
    # 5x5 conv, 1 input, 32 outputs
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    # 5x5 conv, 32 inputs, 64 outputs
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    # 5x5 conv, 32 inputs, 64 outputs
    'wc3': tf.Variable(tf.random_normal([5, 5, 64, 128])),
    # fully connected, 7*7*64 inputs, 1024 outputs
    'wd1': tf.Variable(tf.random_normal([25*25*128, 1024])),
    # 1024 inputs, 10 outputs (class prediction)
    'out': tf.Variable(tf.random_normal([1024, n_output]))
}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bc3': tf.Variable(tf.random_normal([128])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_output]))
}

# Construct model
pred = conv_net(x, weights, biases, keep_prob)
# pdb.set_trace()
pred = tf.pack(tf.transpose(pred,[1,2,0]))
pred = tf.reshape(pred, [-1,n_input,n_input,n_classes])
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.initialize_all_variables()
saver = tf.train.Saver()

def convert_to_2_channel(x, batch_size):
    #assume input has dimension (batch_size,x,y)
    #output will have dimension (batch_size,x,y,2)
    output = np.empty((batch_size, 200, 200, 2))

    temp_arr1 = np.empty((batch_size, 200, 200))
    temp_arr2 = np.empty((batch_size, 200, 200))

    for i in xrange(batch_size):
        for j in xrange(200):
            for k in xrange(200):
                if x[i][j][k] == 1:
                    temp_arr1[i][j][k] = 1
                    temp_arr2[i][j][k] = 0
                else:
                    temp_arr1[i][j][k] = 0
                    temp_arr2[i][j][k] = 1

    for i in xrange(batch_size):
        for j in xrange(200):
            for k in xrange(200):
                for l in xrange(2):
                    if l == 0:
                        output[i][j][k][l] = temp_arr1[i][j][k]
                    else:
                        output[i][j][k][l] = temp_arr2[i][j][k]

    return output

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    summary = tf.train.SummaryWriter('/tmp/logdir/', sess.graph_def)
    step = 1
    from tensorflow.contrib.learn.python.learn.datasets.scroll import scroll_data
    data = scroll_data.read_data('/home/kendall/Desktop/')
    # Keep training until reach max iterations
    flag = 0
    # while flag == 0:
    while step * batch_size < training_iters:
        batch_y, batch_x = data.train.next_batch(batch_size)
        # pdb.set_trace()
        # batch_x = batch_x.reshape((batch_size, n_input))
        batch_x = batch_x.reshape((batch_size, n_input, n_input))
        batch_y = batch_y.reshape((batch_size, n_input, n_input))
        batch_y = convert_to_2_channel(batch_y, batch_size)
        # batch_y = batch_y.reshape((batch_size, n_output, n_classes))
        batch_y = batch_y.reshape((batch_size, 200, 200, n_classes))
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                       keep_prob: dropout})
        if step % display_step == 0:
            flag = 1
            # Calculate batch loss and accuracy
            loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                              y: batch_y,
                                                              keep_prob: 1.})
            print "Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
        step += 1
    print "Optimization Finished!"
    save_path = "model.ckpt"
    saver.save(sess, save_path)

    im = Image.open('/home/kendall/Desktop/HA900_frames/frame0635.tif')
    batch_x = np.array(im)
    pdb.set_trace()
    batch_x = batch_x.reshape((1, n_input, n_input))
    batch_x = batch_x.astype(float)
    # pdb.set_trace()
    prediction = sess.run(pred, feed_dict={x: batch_x, keep_prob: 1.})
    print prediction
    arr1 = np.empty((n_input,n_input))
    arr2 = np.empty((n_input,n_input))
    for i in xrange(n_input):
        for j in xrange(n_input):
            for k in xrange(2):
                if k == 0:
                    arr1[i][j] = prediction[0][i][j][k]
                else:
                    arr2[i][j] = prediction[0][i][j][k]
    # prediction = np.asarray(prediction)
    # prediction = np.reshape(prediction, (200,200))
    # np.savetxt("prediction.csv", prediction, delimiter=",")
    np.savetxt("prediction1.csv", arr1, delimiter=",")
    np.savetxt("prediction2.csv", arr2, delimiter=",")

    # Calculate accuracy for 256 mnist test images
    print "Testing Accuracy:", \
        sess.run(accuracy, feed_dict={x: data.test.images[:256],
                                      y: data.test.labels[:256],
                                      keep_prob: 1.})

Upvotes: 3

Views: 5576

Answers (1)

Olivier Moindrot
Olivier Moindrot

Reputation: 28208

Errors in the code

There are multiple errors in your code:

WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.

  • in fact since you have 2 classes, you should use a loss with softmax, using tf.nn.softmax_cross_entropy_with_logits

  • When using tf.argmax(pred, 1), you only apply argmax over axis 1, which is the height of the output image. You should use tf.argmax(pred, 3) on the last axis (of size 2).

    • This might explain why you get 0.99 accuracy
    • On the output image, it will take the argmax over the height of the image, which is by default 0 (as all values are equal for each channel)

Wrong model

The biggest drawback is that your model in general will be very hard to optimize.

  • You have a softmax over 40,000 classes, which is huge.
  • You do not take advantage at all of the fact that you want to output an image (the prediction foreground / background).
    • for instance prediction 2,345 is highly correlated with prediction 2,346 and prediction 2,545 but you don't take that into account

I recommend reading a bit about semantic segmentation first:

  • this paper: Fully Convolutional Networks for Semantic Segmentation
  • these slides from CS231n (Stanford): especially the part about upsampling and deconvolution

Recommendations

If you want to work with TensorFlow, you will need to start small. First try a very simple network with maybe 1 hidden layer.

You need to plot all the shapes of your tensors to make sure they correspond to what you thought. For instance, if you had plotted tf.argmax(y, 1), you would have realized the shape is [batch_size, 200, 2] instead of the expected [batch_size, 200, 200].

TensorBoard is your friend, you should try to plot the input image here, as well as your predictions to see what they look like.

Try small, with a very small dataset of 10 images and see if you can overfit it and predict almost the exact response.


To conclude, I am not sure of all my suggestions but they are worth trying, and I hope this will help you on the path to success !

Upvotes: 8

Related Questions