K. Paris

Reputation: 61

TensorFlow always predicts the same output

So, I'm trying to learn TensorFlow and, for that, I'm trying to create a classifier for something that I think is not so hard: I'd like to predict whether a number is odd or even. The problem is that TensorFlow always predicts the same output. I have been searching for answers over the last few days, but nothing has helped me. I saw the following answers:

-Tensorflow predicts always the same result

-TensorFlow always converging to same output for all items after training

-TensorFlow always return same result

Here's my code:

in:

df
    nb  y1
0   1   0
1   2   1
2   3   0
3   4   1
4   5   0
...
19  20  1

inputX = df.loc[:, ['nb']].as_matrix()
inputY = df.loc[:, ['y1']].as_matrix()
print(inputX.shape)
print(inputY.shape)

out:

(20, 1)
(20, 1)

in:

# Parameters
learning_rate = 0.00000001
training_epochs = 2000
display_step = 50
n_samples = inputY.size


x = tf.placeholder(tf.float32, [None, 1])   
W = tf.Variable(tf.zeros([1, 1]))           
b = tf.Variable(tf.zeros([1]))            
y_values = tf.add(tf.matmul(x, W), b)      
y = tf.nn.relu(y_values)                 
y_ = tf.placeholder(tf.float32, [None,1])  

# Cost function: Mean squared error
cost = tf.reduce_sum(tf.pow(y_ - y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize variabls and tensorflow session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(training_epochs):  
    sess.run(optimizer, feed_dict={x: inputX, y_: inputY}) # Take a gradient descent step using our inputs and labels

    # Display logs per epoch step
    if (i) % display_step == 0:
        cc = sess.run(cost, feed_dict={x: inputX, y_:inputY})
        print("Training step:", '%04d' % (i), "cost=", "{:.9f}".format(cc)) #, \"W=", sess.run(W), "b=", sess.run(b)

print ("Optimization Finished!")
training_cost = sess.run(cost, feed_dict={x: inputX, y_: inputY})
print ("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

out:

Training step: 0000 cost= 0.250000000
Training step: 0050 cost= 0.250000000
Training step: 0100 cost= 0.250000000
...
Training step: 1800 cost= 0.250000000
Training step: 1850 cost= 0.250000000
Training step: 1900 cost= 0.250000000
Training step: 1950 cost= 0.250000000
Optimization Finished!
Training cost= 0.25 W= [[ 0.]] b= [ 0.]

in:

sess.run(y, feed_dict={x: inputX })

out:

array([[ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]], dtype=float32)

I tried playing with my hyperparameters, such as the learning rate and the number of training epochs. I changed the activation function from softmax to relu. I changed my dataframe to have more examples, but nothing happened. I also tried random initialization for my weights, but nothing changed; the cost just started at a higher value.

Upvotes: 3

Views: 3609

Answers (3)

János

Reputation: 153

First of all, I have to admit that I have never used TensorFlow. But I think you have a modelling problem here.

You are using the simplest network architecture possible (a one-dimensional perceptron). You have two variables (w and b) which you want to learn, and your decision rule for the output looks like

predict 1 if w * x + b > 0, otherwise predict 0

If you subtract b and divide by w, you get

x > -b / w

So you are basically looking for a threshold to separate odd and even numbers. No matter how you choose w and b, you will always misclassify half of the numbers.

Although deciding whether a number is odd or even seems to be a super trivial task for us humans, it is not for a single perceptron.
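
You can check this quickly in plain Python (a rough sketch, assuming y1 = 1 for even numbers, as in your dataframe): whatever threshold you pick, a rule of the form "predict 1 if nb is above the threshold" stays at roughly chance level on 1..20.

best = 0.0
for t in range(0, 21):                                 # try every possible threshold
    preds = [1 if nb > t else 0 for nb in range(1, 21)]
    labels = [1 - nb % 2 for nb in range(1, 21)]       # y1 = 1 for even nb, 0 for odd
    acc = sum(p == l for p, l in zip(preds, labels)) / 20.0
    best = max(best, acc)
print(best)   # 0.55 at best -- essentially chance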

Upvotes: 2

2006pmach

Reputation: 371

The main problem that I see is that you initialize the weights in your W matrix with zeros. The operation that you have in the linear layer is basically Wx + b, so the gradient with respect to x is W. If you start with zeros for W, then the gradient is 0 as well and you are not able to learn anything. Try using random initial values instead, as stated on tensorflow.org:

# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
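
Adapted to the shapes in your question, that would look something like this (just a sketch; the stddev of 0.35 is simply the value from the docs, not something tuned for this problem):

W = tf.Variable(tf.random_normal([1, 1], stddev=0.35), name="weights")
b = tf.Variable(tf.zeros([1]), name="biases")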

Upvotes: 3

Pietro Tortella

Reputation: 1114

From a quick look at the code, it looks OK to me (apart, maybe, from initializing the weights to zero; usually you want small values different from zero to avoid a trivial solution), but I don't think you can fit the parity of integers with a linear regression.

The point is that you are trying to fit

x % 2

with predictions of the form

activation(x * w + b)

and there is no way to find good w and b to solve this problem.

Another way to understand this is to plot your data: the scatter plot of the parity of x is two lines of points, and the only way to fit them with a line is a flat line (which will have a high cost anyway).

I think it would be better to change the data to start with, but if you want to address this problem, you should be able to obtain some results using a sine or a cosine as the activation function.
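
A rough sketch of that idea, keeping the rest of the question's code unchanged and only swapping the activation (whether gradient descent actually finds good values for W and b still depends on the initialization and the learning rate):

y_values = tf.add(tf.matmul(x, W), b)
y = tf.square(tf.sin(y_values))   # periodic in x and bounded in [0, 1], like the labels
# an exact solution exists here, e.g. W = [[pi/2]], b = [pi/2] gives 1 for even x and 0 for odd x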

Upvotes: 3
