Liam

Reputation: 669

TensorFlow: first-layer neurons' weights don't change

Is it OK if my first-layer neurons' weights don't change?

I'm working with the MNIST network in TensorFlow, and I tried to collect the neurons' weights like this in the "inference" function:

def inference(images, hidden1_units, hidden2_units):

    weights = []

    # Hidden 1
    with tf.name_scope('hidden1'):
        weights.append(tf.Variable( tf.truncated_normal([IMAGE_PIXELS, hidden1_units], stddev=1.0 / math.sqrt(float(IMAGE_PIXELS)))))
        biases = tf.Variable(tf.zeros([hidden1_units]))
        hidden1 = tf.nn.relu(tf.matmul(images, weights[0]) + biases)

    # Hidden 2
    with tf.name_scope('hidden2'):
        weights.append(tf.Variable(tf.truncated_normal([hidden1_units, hidden2_units],stddev=1.0 / math.sqrt(float(hidden1_units)))))
        biases = tf.Variable(tf.zeros([hidden2_units]))
        hidden2 = tf.nn.relu(tf.matmul(hidden1, weights[1]) + biases)

    # Linear
    with tf.name_scope('softmax_linear'):
        weights.append(tf.Variable(tf.truncated_normal([hidden2_units, NUM_CLASSES],stddev=1.0 / math.sqrt(float(hidden2_units)))))
        biases = tf.Variable(tf.zeros([NUM_CLASSES]))
        logits = tf.matmul(hidden2, weights[2]) + biases
    return weights, logits

I create a list in which I store each layer's weight matrix.

I print the list like this:

print_weights(sess.run(poids))

where print_weights is

def print_weights(poids):
    for i in range(len(poids)):
        print('-- ' + str(i) + ' --')
        print(poids[i])

So far, so good. But when I display the weights at the beginning and at the end of training, the first layer's weights haven't changed.

BEGINNING

-- 0 --

[[ 0.03137168  0.03483023]
 [ 0.01353009  0.00035462]
 [ 0.02957422 -0.01347954]
 ..., 
 [-0.04083598  0.02377481]
 [-0.05120984  0.00143244]
 [-0.01799158 -0.02219945]]

-- 1 --

[[ 0.68714064]
 [ 0.30847442]]

-- 2 --

[[ 0.87441564  0.09957008 -0.58042473  1.34084558 -0.46372819 -0.19947429
  -1.46314788 -0.59285629  0.72775543 -0.69785988]]


END

-- 0 --

[[ 0.03137168  0.03483023]
 [ 0.01353009  0.00035462]
 [ 0.02957422 -0.01347954]
 ..., 
 [-0.04083598  0.02377481]
 [-0.05120984  0.00143244]
 [-0.01799158 -0.02219945]]

-- 1 --

[[-1.16852498]
 [-0.27643263]]

-- 2 --

[[ 0.98213464  0.12448452 -0.36638314  0.47689819 -0.42525211 -0.13292283
  -1.29118276 -0.49366322  0.74673325 -0.57575113]]

As you can see, the second and third weight arrays change, but the first one doesn't, and I don't know why. Could someone help me, please? Thanks!

Upvotes: 1

Views: 825

Answers (1)

rdadolf

Reputation: 1248

I wrapped your code up in a training harness and ran it without issue.

I think the problem here is not your code but the interpretation of the results. NumPy summarizes large arrays the way you've shown, by displaying only the first few and last few elements. (The elements of your poids list are NumPy arrays.)

What you're seeing is that the first few and last few weights aren't changing, and you're concluding that the entire matrix isn't changing. But it is!
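To illustrate the summarizing behavior, here is a minimal sketch using made-up arrays rather than your actual weights. The names `before` and `after`, the 784x2 shape, and the values are assumptions for the demo only. Every row is changed except the first and last three, which are the only rows NumPy shows for an array this size under its default print options.

```python
import numpy as np

# Hypothetical stand-ins for the first-layer weights before and after training.
rng = np.random.default_rng(0)
before = rng.normal(scale=0.03, size=(784, 2))
after = before.copy()
after[3:-3] += 0.1  # change every row except the first and last three

# Arrays above the default print threshold are summarized,
# so only the edge rows appear in the printed text.
same_printout = np.array2string(before) == np.array2string(after)
really_equal = np.array_equal(before, after)
print(same_printout, really_equal)
```

With the default print options, the two printouts should match even though the arrays differ. As a possible explanation for why the edge rows stay fixed in your run (an assumption on my part, not something I verified): those rows multiply the corner pixels of the image, which are always zero in MNIST, so they would receive zero gradient.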

Try using this as a summary method instead (print the mean and standard deviation instead of just a few elements):

import numpy as np

def print_weights(poids):
  for i in range(len(poids)):
    print('-- ' + str(i) + ' --')
    # Mean and standard deviation cover the whole matrix, not just its edges.
    print(np.mean(poids[i]), np.std(poids[i]))
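If you want to check for change directly instead of eyeballing summary statistics, one option is to keep a snapshot from before training and compare it with the final values. This is only a sketch: `weight_change` is a hypothetical helper, and the two lists below are made-up stand-ins for what `sess.run(poids)` would return before and after training.

```python
import numpy as np

def weight_change(before, after):
    """Return the largest absolute change per layer between two lists of arrays."""
    return [float(np.max(np.abs(a - b))) for b, a in zip(before, after)]

# Hypothetical snapshots standing in for sess.run(poids) before and after training.
before = [np.zeros((784, 2)), np.ones((2, 1))]
after = [before[0].copy(), before[1] - 0.5]
after[0][400, 1] = 0.25  # a single weight in the middle of the matrix changes

changes = weight_change(before, after)
for i, c in enumerate(changes):
    print('-- ' + str(i) + ' --', c)
```

A nonzero value for a layer means at least one of its weights moved, even if the printed corners of the matrix look identical.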

Upvotes: 4
