Reputation: 21
I wonder whether the softmax provided by the TensorFlow package, namely tensorflow.nn.softmax, can be substituted by one implemented by myself?
I ran the original tutorial file mnist_softmax.py with the cross_entropy calculation:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)), reduction_indices=[1]))
and it gives a validation accuracy of 0.9195, which makes sense.
However, I made some changes, as shown below:
# Create the model
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b
# The two lines below are added by me; in my opinion this is an equivalent way to calculate softmax
y1 = tf.reduce_sum(y)
y2 = tf.scalar_mul(1.0 / y1, y)
# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10])
...
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y2), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
With these changes, the validation accuracy is only 0.098.
Does anyone have any insight into this problem? Thanks a lot.
Upvotes: 0
Views: 890
Reputation: 12795
Your y2 is in fact not equivalent to computing softmax. Softmax is
softmax(y) = e^y / S
where S is a normalizing factor (the sum of e^y across all the classes). Also, when you compute the normalizing factor, you only need to reduce the sum over the classes, not over the samples. A more proper way would be
# Sum over the classes only (axis 1); keep_dims lets the division broadcast per row.
y1 = tf.reduce_sum(tf.exp(y), reduction_indices=[1], keep_dims=True)
y2 = tf.exp(y) / y1
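For reference, here is a minimal sketch (assuming TensorFlow 1.x, the same API as the tutorial) that checks this per-row normalization against tf.nn.softmax on a small batch of logits:

# Minimal check (assuming TensorFlow 1.x): manual per-row softmax vs. tf.nn.softmax
import numpy as np
import tensorflow as tf

logits = tf.constant([[1.0, 2.0, 3.0],
                      [1.0, 1.0, 1.0]])

# Sum exp(logits) over the classes (axis 1) only; keep_dims makes the
# division broadcast row by row.
denom = tf.reduce_sum(tf.exp(logits), reduction_indices=[1], keep_dims=True)
manual_softmax = tf.exp(logits) / denom

builtin_softmax = tf.nn.softmax(logits)

with tf.Session() as sess:
    manual, builtin = sess.run([manual_softmax, builtin_softmax])
    print(np.allclose(manual, builtin))  # expected: True

Note that exponentiating large logits directly can overflow, which is one reason to prefer the built-in tf.nn.softmax (or tf.nn.softmax_cross_entropy_with_logits for the loss) in practice.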
Upvotes: 1