Pavan elisetty

Reputation: 322

What is the difference between using Softmax as a sequential layer in tf.keras and softmax as an activation function for a dense layer?

tf.keras.layers.Dense(10, activation=tf.nn.softmax)

and

tf.keras.layers.Softmax(10)

Upvotes: 8

Views: 1801

Answers (1)

Marco Cerliani

Reputation: 22031

They are the same; you can test it on your own:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Softmax

# generate data
x = np.random.uniform(0, 1, (5, 20)).astype('float32')

# 1st option: Dense layer with a softmax activation
X = Dense(10, activation=tf.nn.softmax)
A = X(x)

# 2nd option: the same affine transform followed by a separate Softmax layer
w, b = X.get_weights()
B = Softmax()(tf.matmul(x, w) + b)

tf.reduce_all(A == B)
# <tf.Tensor: shape=(), dtype=bool, numpy=True>

Also note that tf.keras.layers.Softmax doesn't require you to specify the units; it's a simple activation layer.

By default, the softmax is computed on the last axis (axis=-1). If your outputs have more than 2 dimensions and you want to apply softmax along a different axis, you can change this via the layer's axis argument, which is easy to do with the second option.
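For example, a minimal sketch (with assumed shapes and random values, just to illustrate the axis argument):

# applying Softmax over a non-default axis of a 3D tensor
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Softmax

t = np.random.uniform(0, 1, (2, 3, 4)).astype('float32')  # batch of 2, each a 3x4 tensor

# default: softmax over the last axis, so each row sums to 1
out_last = Softmax()(t)
print(tf.reduce_sum(out_last, axis=-1))  # all ones

# softmax over axis=1 instead, so each column sums to 1
out_axis1 = Softmax(axis=1)(t)
print(tf.reduce_sum(out_axis1, axis=1))  # all ones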

Upvotes: 6
