user2647513

Reputation: 584

Tensorflow initialization gives all ones

tensorflow 1.12.0

In the code snippet below, it seems that wrapped_rv_val and seq_rv_val should be equivalent, but they are not. seq_rv_val is correctly derived from the randomly generated init_val array, whereas wrapped_rv_val is set to all ones. What's going on here?

import numpy as np
import tensorflow as tf

init_val = np.random.rand(1, 1, 16, 1).astype(np.dtype('float32'))

wrapped_rv = tf.nn.softmax(tf.get_variable('wrapped_rv', initializer=init_val))

var = tf.get_variable('seq_rv', initializer=init_val)
seq_rv = tf.nn.softmax(var, axis=2)

init_op = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init_op)
    wrapped_rv_val = sess.run(wrapped_rv)
    seq_rv_val = sess.run(seq_rv)

    print("seq_rv_val: {0}".format(seq_rv_val.flatten()))
    print("wrapped_rv_val: {0}".format(wrapped_rv_val.flatten())) 

output:

seq_rv_val: [0.28422353 0.12556878 0.18170598 0.19684952 0.21165217]

wrapped_rv_val: [1. 1. 1. 1. 1.]

Upvotes: 3

Views: 98

Answers (1)

giser_yugang

Reputation: 6166

seq_rv_val and wrapped_rv_val will match, both holding the softmax of init_val taken over the 16-element axis, once you make the following change.

# change
wrapped_rv = tf.nn.softmax(tf.get_variable('wrapped_rv', initializer=init_val))
# to
wrapped_rv = tf.nn.softmax(tf.get_variable('wrapped_rv', initializer=init_val), axis=2)

Here is why wrapped_rv comes out as all ones. Recall the softmax formula:

softmax(x)_i = exp(x_i) / sum_j exp(x_j)

where the sum over j runs over the entries along the chosen axis.

With axis=2, the denominator sums over 16 terms. With the default axis=-1, it sums over just 1 term, because the last dimension of the (1, 1, 16, 1) tensor has size 1. The numerator then equals the denominator, so every output is 1. You can run the following example to see this.

import tensorflow as tf

# Each tensor has shape (3, 1), so the default last axis (axis=-1) has size 1.
y1 = tf.constant([[1],[2],[3]],dtype=tf.float32)
y2 = tf.constant([[1],[3],[7]],dtype=tf.float32)

# Softmax over a size-1 axis divides every element's exponential by itself.
softmax_var1 = tf.nn.softmax(logits=y1)
softmax_var2 = tf.nn.softmax(logits=y2)

with tf.Session() as sess:
    print(sess.run(softmax_var1))
    print(sess.run(softmax_var2))

[[1.]
 [1.]
 [1.]]
[[1.]
 [1.]
 [1.]]
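
The same effect can be checked without a TensorFlow session. The following is a minimal NumPy sketch, not TensorFlow's own implementation: the softmax helper is a hypothetical stand-in written for illustration, applied to an array with the shape used in the question.

```python
import numpy as np

def softmax(x, axis=-1):
    # Hypothetical helper for illustration, not the TensorFlow implementation.
    # Subtracting the max along the axis keeps the exponentials numerically stable.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=axis, keepdims=True)

# Same shape as init_val in the question: the last axis has size 1.
init_val = np.random.rand(1, 1, 16, 1).astype(np.float32)

last_axis = softmax(init_val, axis=-1)  # normalizes over the size-1 axis
axis_two = softmax(init_val, axis=2)    # normalizes over the 16 entries

print(last_axis.flatten())  # every entry is 1: each value is divided by itself
print(axis_two.flatten())   # 16 values that sum to 1
```

Normalizing over an axis of length 1 always yields ones regardless of the input values, which is why the choice of axis matters more here than the initializer.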

Upvotes: 1
