Iinferno1

Reputation: 49

TensorFlow converges to the mean

I am trying to predict a binary output using TensorFlow. The training data is roughly 69% zeros for the output. The input features are real-valued, and I normalized them by subtracting the mean and dividing by the standard deviation. Every time I run the network, no matter what techniques I've tried, I cannot get a model above 69% accuracy, and it looks like my Yhat is converging to all zeros.
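For reference, the standardization described above amounts to the following (a minimal numpy sketch; the feature matrix here is made up, not the actual data):

```python
import numpy as np

# Hypothetical feature matrix: 100 samples, 14 real-valued features.
X_raw = np.random.RandomState(0).randn(100, 14) * 3.0 + 5.0

# Standardize each column: subtract the mean, divide by the standard deviation.
mean = X_raw.mean(axis=0)
std = X_raw.std(axis=0)
X = (X_raw - mean) / std
```

After this, every column has zero mean and unit variance.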

I've tried a lot of things: different optimizers, loss functions, batch sizes, etc., but no matter what I do it converges to 69% and never goes over. I'm guessing there's a more fundamental problem with what I'm doing, but I can't seem to find it.

Here is the latest version of my code:

    import tensorflow as tf

    X = tf.placeholder(tf.float32,shape=[None,14],name='X')
    Y = tf.placeholder(tf.float32,shape=[None,1],name='Y')

    W1 = tf.Variable(tf.truncated_normal(shape=[14,20],stddev=0.5))
    b1 = tf.Variable(tf.zeros([20]))
    l1 = tf.nn.relu(tf.matmul(X,W1) + b1)

    l1 = tf.nn.dropout(l1,0.5)

    W2 = tf.Variable(tf.truncated_normal(shape=[20,20],stddev=0.5))
    b2 = tf.Variable(tf.zeros([20]))
    l2 = tf.nn.relu(tf.matmul(l1,W2) + b2)

    l2 = tf.nn.dropout(l2,0.5)

    W3 = tf.Variable(tf.truncated_normal(shape=[20,15],stddev=0.5))
    b3 = tf.Variable(tf.zeros([15]))
    l3 = tf.nn.relu(tf.matmul(l2,W3) + b3)

    l3 = tf.nn.dropout(l3,0.5)

    W5 = tf.Variable(tf.truncated_normal(shape=[15,1],stddev=0.5))
    b5 = tf.Variable(tf.zeros([1]))
    Yhat = tf.matmul(l3,W5) + b5

    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=Yhat, labels=Y))

    learning_rate = 0.005
    l2_weight = 0.001  # note: defined but never applied to the loss
    learner = tf.train.AdamOptimizer(learning_rate).minimize(loss)

    correct_prediction = tf.equal(tf.greater(Y,0.5), tf.greater(Yhat,0.5))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

Upvotes: 2

Views: 429

Answers (2)

Nathan Zhang

Reputation: 16

When you calculate your correct_prediction:

    correct_prediction = tf.equal(tf.greater(Y,0.5), tf.greater(Yhat,0.5))

Yhat here is still the raw logits. You should first pass it through a sigmoid to get Y_pred, then use Y_pred to calculate your correct_prediction:

    Y_pred = tf.nn.sigmoid(Yhat)
    correct_prediction = tf.equal(tf.greater(Y,0.5), tf.greater(Y_pred,0.5))
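For what it's worth, since sigmoid(x) > 0.5 exactly when x > 0, thresholding the raw logits at 0 would give the same predictions. A quick check in plain numpy (outside TF) illustrates the equivalence:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

logits = np.array([-2.0, -0.1, 0.0, 0.3, 4.0])

# Thresholding probabilities at 0.5 is the same as thresholding logits at 0.
preds_from_probs = sigmoid(logits) > 0.5
preds_from_logits = logits > 0.0
print(np.array_equal(preds_from_probs, preds_from_logits))  # True
```

The original code's `tf.greater(Yhat, 0.5)` instead corresponds to a probability threshold of about 0.62, which biases predictions toward the majority class.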

Upvotes: 0

Himaprasoon

Reputation: 2659

You are using a constant dropout keep probability:

    l3 = tf.nn.dropout(l3,0.5)

Dropout should be used only while training and not while checking accuracy or during prediction.

    keep_prob = tf.placeholder(tf.float32)
    l3 = tf.nn.dropout(l3,keep_prob)

The placeholder should be given an appropriate value (e.g. 0.5) during training and 1.0 while testing/predicting.
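To see why feeding 1.0 at test time matters: tf.nn.dropout uses inverted dropout, so with keep_prob = 1 the layer is the identity. A rough numpy sketch of that behavior (an illustration, not the TF implementation itself):

```python
import numpy as np

def dropout(x, keep_prob, rng):
    # Inverted dropout: zero each unit with probability 1 - keep_prob
    # and scale the survivors by 1 / keep_prob, so the expected
    # activation is unchanged.
    mask = rng.random_sample(x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.RandomState(0)
x = np.ones((4, 5))

train_out = dropout(x, keep_prob=0.5, rng=rng)  # roughly half the units zeroed
test_out = dropout(x, keep_prob=1.0, rng=rng)   # identity: nothing is dropped

print(np.array_equal(test_out, x))  # True
```

With keep_prob fixed at 0.5, the network keeps randomly zeroing activations even while you measure accuracy, which adds noise to every prediction.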

You also have dropout after every layer; I am not sure a network this small needs that much regularization. Hope this helps.

Upvotes: 0
