Reputation: 6033
import tensorflow as tf

sess = tf.InteractiveSession()
num_elements = 10
output = [[0.76158798] * num_elements]  # 1 x 10
softmax_w = [[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]] * num_elements  # 10 x 7
print(tf.matmul(output, softmax_w).eval())
gives
[[ 0.76158804 0.76158804 0.76158804 0.76158804 0.76158804 0.76158804 0.76158804]]
Changing num_elements to 50:
sess = tf.InteractiveSession()
num_elements = 50
output = [[0.76158798] * num_elements]
softmax_w = [[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]] * num_elements
print(tf.matmul(output, softmax_w).eval())
gives
[[ 3.80794024 3.80794024 3.80794024 3.80794024 3.80793881 3.80793881 3.80793881]]
Why are the elements in the result matrix not all the same for the second example?
I'm using TensorFlow 0.11.0rc0.
Upvotes: 2
Views: 415
Reputation: 44
I believe this is caused by the fact that your row length (7) is not an integer multiple of the number of floats that fit in an SSE register (four, since the registers are 128 bits wide). For your larger example, the first 4 elements of the output are computed using a vectorized code path, while the last 3 are computed in a scalar "cleanup" loop. The order of the floating-point additions performed in the vectorized and scalar versions of the code differs, and since floating-point addition is not associative, discrepancies on the order of
num_elements * std::numeric_limits<float>::epsilon() * std::abs(result)
occur.
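For intuition, the effect is reproducible outside TensorFlow. Here is a minimal NumPy sketch (my own illustration, not TensorFlow's actual kernel): it accumulates the same fifty float32 products once with a single left-to-right running sum, and once with four interleaved partial sums the way a 4-wide SIMD loop would.

import numpy as np

# One output element of the matmul is a sum of 50 identical float32 products.
term = np.float32(0.76158798) * np.float32(0.1)
terms = np.full(50, term, dtype=np.float32)

# Scalar-style accumulation: one running sum, strictly left to right.
scalar_sum = np.float32(0.0)
for t in terms:
    scalar_sum = scalar_sum + t

# SIMD-style accumulation: four independent "lane" sums, combined at the end.
lanes = [np.float32(0.0)] * 4
for i, t in enumerate(terms):
    lanes[i % 4] = lanes[i % 4] + t
simd_sum = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3])

print(scalar_sum, simd_sum, scalar_sum - simd_sum)

The two totals typically disagree by a few float32 ULPs of the result, the same magnitude as the gap between 3.80794024 and 3.80793881 in the question.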
Upvotes: 1
Reputation: 3159
It seems to be caused by numerical error. I got the same results with your code, but when I made output and softmax_w float64 tensors, the problem disappeared:
import tensorflow as tf

sess = tf.InteractiveSession()
num_elements = 50
output = tf.convert_to_tensor([[0.76158798] * num_elements], dtype=tf.float64)
softmax_w = tf.convert_to_tensor([[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]] * num_elements, dtype=tf.float64)
print(tf.matmul(output, softmax_w).eval())
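Note that double precision doesn't eliminate the reordering; it only shrinks the rounding error below what print displays. Plugging both precisions into the bound from the other answer (a rough back-of-the-envelope check, not an exact kernel-level bound):

import numpy as np

result = 50 * 0.76158798 * 0.1  # the value every output element approximates, ~3.80794

# num_elements * epsilon * |result| for each precision:
print(50 * np.finfo(np.float32).eps * abs(result))  # ~2e-05, comfortably admits the ~1.4e-06 gap observed
print(50 * np.finfo(np.float64).eps * abs(result))  # ~4e-14, invisible at print's default precision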
Upvotes: 1