Yurui Ming

Reputation: 21

Can the default softmax in TensorFlow be substituted with a self-implemented one?

I am wondering whether the softmax provided by the TensorFlow package, namely tensorflow.nn.softmax, can be substituted with one implemented by myself.

When I run the original tutorial file mnist_softmax.py with the following cross_entropy calculation:

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)), reduction_indices=[1]))

it gives a cross-validation accuracy of 0.9195, which makes sense.

However, when I make some changes, as shown below:

# Create the model
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b

# The two lines below are added by me; an equivalent way to calculate softmax, at least in my opinion
y1 = tf.reduce_sum(y)
y2 = tf.scalar_mul(1.0 / y1, y)

# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10])

...
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y2), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

the cross-validation accuracy drops to only 0.098.

Does anyone have any insight into this problem? Thanks a lot.

Upvotes: 0

Views: 890

Answers (1)

Ishamael

Reputation: 12795

Your y2 is in fact not equivalent to computing softmax. Softmax is

softmax(y) = e^y / S

where S is a normalizing factor (the sum of e^y across the classes of each sample). Also, when you compute the normalizing factor, you only need to reduce the sum over the classes, not over the samples. A more proper way would be

# Sum e^y over the class dimension for each sample, keeping the dimension
# so the division below broadcasts across classes
y1 = tf.reduce_sum(tf.exp(y), reduction_indices=[1], keep_dims=True)
y2 = tf.exp(y) / y1
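As a sanity check, here is a minimal sketch (using the same old-style TensorFlow API as above, with a hypothetical random batch of logits) comparing the manual computation against tf.nn.softmax. Subtracting the per-row maximum before exponentiating does not change the result mathematically but keeps it numerically stable for large logits:

    import numpy as np
    import tensorflow as tf

    logits = tf.placeholder(tf.float32, [None, 10])

    # Shift each row by its maximum to avoid overflow in exp
    shifted = logits - tf.reduce_max(logits, reduction_indices=[1], keep_dims=True)

    # Manual softmax: normalize e^y by its per-sample sum over the classes
    manual_softmax = tf.exp(shifted) / tf.reduce_sum(tf.exp(shifted),
                                                     reduction_indices=[1],
                                                     keep_dims=True)
    builtin_softmax = tf.nn.softmax(logits)

    with tf.Session() as sess:
        batch = np.random.randn(4, 10).astype(np.float32)  # hypothetical logits
        manual, builtin = sess.run([manual_softmax, builtin_softmax],
                                   feed_dict={logits: batch})
        print(np.allclose(manual, builtin, atol=1e-6))  # expected: True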

Upvotes: 1
