Ricardo Achilles

Reputation: 31

TensorFlow: parameters do not update when training

I'm implementing a classification model using TensorFlow.

The problem that I'm facing is that my weights and error are not being updated when I run the training step. As a result, my network keeps returning the same results.

I've developed my model based on the MNIST example from the TensorFlow website.

import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()

#load dataset
dataset = np.loadtxt('char8k.txt', dtype='float', comments='#', delimiter=",")
Y = np.asmatrix( dataset[:,0] ) 
X = np.asmatrix( dataset[:,1:1201] )

m = 11527
labels = 26

# Y is converted to a one-hot matrix of shape 11527x26
Yt = np.zeros((m,labels))

for i in range(0,m):
    index = int(Y[0,i]) - 1  # labels are loaded as 1-based floats; cast to use as an index
    Yt[i,index]= 1

Y = Yt
Y = np.asmatrix(Y)

#------------------------------------------------------------------------------

#graph settings

x = tf.placeholder(tf.float32, shape=[None, 1200])
y_ = tf.placeholder(tf.float32, shape=[None, 26])


Wtest = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))
W = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))
b = tf.Variable(tf.zeros([26]))
sess.run(tf.initialize_all_variables())

y = tf.nn.softmax(tf.matmul(x,W) + b)

cross_entropy = -tf.reduce_sum(y_*tf.log(y))

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
Wtest = W


for i in range(10):
  print("iteration:")
  print(i)
  Xbatch = X[np.random.randint(X.shape[0],size=100),:]
  Ybatch = Y[np.random.randint(Y.shape[0],size=100),:]
  train_step.run(feed_dict={x: Xbatch, y_: Ybatch})
  print("weight update")
  print(Wtest==W)  # monitor weight updates

  correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
  print("accuracy:")
  print(accuracy.eval(feed_dict={x: X, y_: Y}))
  print(" ")
  print(" ")

Upvotes: 2

Views: 5195

Answers (1)

mrry

Reputation: 126154

The issue probably arises from how you initialize the weight matrix, W. If it is initialized to all zeroes, all of the neurons will follow the same gradient in each step, which leads to the network not training. Replacing the line

W = tf.Variable(tf.zeros([1200,26]))

...with something like

W = tf.Variable(tf.truncated_normal([1200,26], stddev=0.001))

...should cause it to start training.

This question on the CrossValidated site has a good explanation of why you should not initialize all of your weights to zero.
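
The symmetry argument can be illustrated with a small NumPy sketch. Note that it makes assumptions that are not in the question: it adds a sigmoid hidden layer (the model in the question is a single softmax layer), and the function name train_two_layer, the toy data, the layer sizes, and the learning rate are all hypothetical. It only shows that hidden units that start out identical receive identical gradients and so stay identical.

```python
import numpy as np

def train_two_layer(W1, W2, X, T, lr=0.1, steps=20):
    """Plain gradient descent on a toy sigmoid-hidden, softmax-output network."""
    W1, W2 = W1.copy(), W2.copy()
    for _ in range(steps):
        H = 1.0 / (1.0 + np.exp(-X @ W1))            # hidden activations
        logits = H @ W2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)            # softmax probabilities
        dlogits = (P - T) / X.shape[0]               # gradient of mean cross-entropy
        dW2 = H.T @ dlogits
        dH = dlogits @ W2.T
        dW1 = X.T @ (dH * H * (1.0 - H))             # backprop through the sigmoid
        W1 -= lr * dW1
        W2 -= lr * dW2
    return W1, W2

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))                 # toy inputs (hypothetical)
T = np.eye(3)[rng.integers(0, 3, size=32)]   # toy one-hot labels (hypothetical)

# Case 1: every weight starts at zero.
W1_zero, _ = train_two_layer(np.zeros((5, 4)), np.zeros((4, 3)), X, T)

# Case 2: small random weights, as in the truncated_normal suggestion.
W1_rand, _ = train_two_layer(rng.normal(scale=0.001, size=(5, 4)),
                             rng.normal(scale=0.001, size=(4, 3)), X, T)

# Each column of W1 holds the incoming weights of one hidden unit.
zero_init_units_identical = np.allclose(W1_zero, W1_zero[:, :1])
rand_init_units_identical = np.allclose(W1_rand, W1_rand[:, :1])
print(zero_init_units_identical, rand_init_units_identical)
```

With the zero start, every hidden unit ends up with the same incoming weights, so the layer behaves like a single unit. The random start breaks that tie from the first step.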

Upvotes: 5
