redmage123

Reputation: 533

Can't get a simple TensorFlow logistic regression program to work with the sigmoid activation function.

I've got a very simple TensorFlow logistic regression program that looks like this:

#!/usr/bin/env python3
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import model_selection
import sys

gender_df = pd.read_csv('data/binary_data.csv')

# Shuffle our data
gender_df = gender_df.sample(frac=1).reset_index(drop=True)

print (gender_df.columns)

train_x,test_x, train_y, test_y = model_selection.train_test_split(gender_df['HEIGHT'],gender_df['GENDER'],test_size = 0.3)

tmp = np.asarray(train_x)
tmp.resize([train_x.shape[0],1])
train_x = tmp

tmp = np.asarray(train_y)
tmp = np.resize(tmp,[train_y.shape[0],2])
train_y = tmp

tmp = np.asarray(test_x)
tmp.resize([test_x.shape[0],1])
test_x = tmp

tmp = np.asarray(test_y)
tmp = np.resize(tmp,[test_y.shape[0],2])
test_y = tmp

n_samples = train_x.shape[0]

x = tf.placeholder(tf.float32, [None,1])
y = tf.placeholder(tf.float32,[None,2])

W = tf.Variable(tf.zeros([1,2]),dtype = tf.float32)
b = tf.Variable(tf.zeros([1]),dtype = tf.float32)

a = tf.nn.sigmoid((W * x) + b)

learning_rate = 0.001

cross_entropy = tf.reduce_mean(-(y*tf.log(a) + (1 - y) * tf.log(1-a)))
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for epoch in range(1000):
        _,l = sess.run([train_step, cross_entropy], feed_dict = {x: train_x, y:train_y})
        if epoch % 50 == 0:
            print ('loss = %f' %(l))

    correct = tf.equal(tf.argmax(a,1), tf.argmax(y,1))
    accuracy = tf.reduce_mean(tf.cast(correct,'float'))
    print ('Accuracy: ', accuracy.eval({x: test_x, y:test_y}))

It's a fairly simple binary classification logistic regression program. It takes 100 rows of sample data with two columns: GENDER, which is either 0 (female) or 1 (male), and HEIGHT, which is in centimeters.

I'm trying to predict gender from height, but the loss value doesn't seem to converge to a minimum. Additionally, the cost values and the accuracy vary wildly from one run to the next, even though the program is looking at the same data.

I can have a run where the accuracy is 0.8, and on the next run the accuracy is 0.2.

Also, I noticed that for some reason the first loss value is always loss = 0.693147, but the rest of the loss calculations can look like this:

loss = 1.397364
loss = 1.397516
loss = 1.397514
loss = 1.397515
loss = 1.397514
...

I'm rather confused about what's happening.
Am I using the right sigmoid function? From my understanding, I only need to use softmax when I have a logistic regression problem with multiple classes; for a simple binary classification I can use tf.sigmoid(). Also, do I need to add the 'b' parameter to the sigmoid function, and should I set it to random values rather than zeros?
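
To make the question concrete, this is roughly the single-output formulation I have in mind (just a sketch, not working code; it keeps the labels as one 0/1 column and uses TensorFlow's built-in tf.nn.sigmoid_cross_entropy_with_logits instead of my hand-written loss):

# sketch of the single-output version I'm asking about
x = tf.placeholder(tf.float32, [None, 1])   # height
y = tf.placeholder(tf.float32, [None, 1])   # gender as a single 0/1 column
W = tf.Variable(tf.zeros([1, 1]), dtype=tf.float32)
b = tf.Variable(tf.zeros([1]), dtype=tf.float32)
logits = tf.matmul(x, W) + b
# the built-in loss applies the sigmoid internally, so it is fed the raw logits
cross_entropy = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))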

Also, can someone suggest a simple binary classification example using logistic regression and TensorFlow that doesn't use the MNIST or Iris datasets?

Any help appreciated.

Thanks

Upvotes: 0

Views: 407

Answers (2)

Mohan Radhakrishnan

Reputation: 3197

This is, again, from the perspective of another TensorFlow straggler, but I managed to get this to execute. I can't offer a thorough explanation, though.

The code now uses a 'one-hot' representation, and the accuracy is 0.84. I just used another dataset, and I don't know how many features are optimal. I believe it should work with the original dataset too, but I had already switched datasets by the time it worked. The features are numerical.

import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import model_selection
from sklearn import preprocessing


data = pd.read_csv('D:/Development_Avector/PycharmProjects/TensorFlow/anes_dataset.csv')
training_features = ['TVnews', 'PID', 'age', 'educ', 'income']
target = 'vote'

print (data.columns)

X = data.loc[:, training_features]
# y is our labels
y = data.loc[:, [target]]

oneHot = preprocessing.OneHotEncoder()
oneHot.fit(X)
X = oneHot.transform(X).toarray()
oneHot.fit(y)
y = oneHot.transform(y).toarray()
train_x,test_x, train_y, test_y = model_selection.train_test_split(X, y, test_size = 0.1, random_state=0)


n_samples = train_x.shape[0]
print ("There are " + str( n_samples) + " samples")
print ("Shape of train_x is " + str(train_x.shape))
print ("Shape of train_y is " + str(train_y.shape))
print (train_y)

x = tf.placeholder(tf.float32, [None,train_x.shape[1]])
y = tf.placeholder(tf.float32,[None,2])
print ("Shape of y is " + str(y.shape))


W = tf.Variable(np.zeros((train_x.shape[1], 2)),tf.float32,name="W")
b = tf.Variable(0.,dtype = tf.float32)


predicted_y1 = tf.add(tf.matmul(x,tf.cast(W,tf.float32) ), b)
print (predicted_y1.shape)
predicted_y = tf.nn.softmax(predicted_y1)
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = predicted_y, labels = y))


optimizer = tf.train.AdamOptimizer(0.0003).minimize(cross_entropy)

s = tf.InteractiveSession()
s.run(tf.global_variables_initializer())

for i in range(40000):
    _, loss_i = s.run([optimizer,cross_entropy], {x: train_x, y: train_y})
    print("loss at iter %i:%.4f" % (i, loss_i))

accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(predicted_y,1), tf.argmax(y,1)), "float"))
accuracy_value = s.run(accuracy, feed_dict={x:test_x, y:test_y})
print (accuracy_value)


loss at iter 0:0.6931
loss at iter 1:0.6929
loss at iter 2:0.6927
loss at iter 3:0.6924
-------------------
loss at iter 39997:0.3599
loss at iter 39998:0.3599
loss at iter 39999:0.3599
0.84210527

Upvotes: 0

Giuseppe Marra

Reputation: 1104

Both your x and your y should be of shape [None, 1], and your W should simply be [1, 1]. Both your input and your output are one-dimensional.

You could even drop the matrix notation and simply use vectors in this example.
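
For concreteness, here is a minimal sketch of the question's graph with those shapes (assuming the labels are kept as a single 0/1 column rather than resized to two; accuracy then uses a 0.5 threshold instead of argmax):

# same model as the question, but with one output column throughout
x = tf.placeholder(tf.float32, [None, 1])
y = tf.placeholder(tf.float32, [None, 1])   # labels stay a single 0/1 column

W = tf.Variable(tf.zeros([1, 1]), dtype=tf.float32)
b = tf.Variable(tf.zeros([1]), dtype=tf.float32)

a = tf.nn.sigmoid(tf.matmul(x, W) + b)
cross_entropy = tf.reduce_mean(-(y * tf.log(a) + (1 - y) * tf.log(1 - a)))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)

# with a single output there is nothing to argmax over, so threshold at 0.5 instead
correct = tf.equal(tf.cast(a > 0.5, tf.float32), y)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))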

Upvotes: 1
