Reputation: 809
I created a Conv1D model for text classification.
When using softmax or sigmoid in the last dense layer, it yields results such as:
softmax => [0.98502016 0.0149798 ]
sigmoid => [0.03902826 0.00037046]
I want the first index of the sigmoid result to be at least greater than 0.8; in other words, I want the classes to have independent probabilities. How do I achieve this?
Model summary:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 128, 100)          600
_________________________________________________________________
conv1d (Conv1D)              (None, 126, 128)          38528
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 63, 128)           0
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 61, 128)           49280
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 30, 128)           0
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 28, 128)           49280
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 14, 128)           0
_________________________________________________________________
flatten (Flatten)            (None, 1792)              0
_________________________________________________________________
dense (Dense)                (None, 2)                 3586
=================================================================
Total params: 141,274
Trainable params: 141,274
Non-trainable params: 0
_________________________________________________________________
model.add(keras.layers.Dense(num_class, activation='sigmoid'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop', metrics=['acc'])
Upvotes: 2
Views: 1227
Reputation: 8595
I agree with the comment by @blue-phoenox that you shouldn't use sigmoid with categorical cross-entropy, because the sum of the class probabilities then does not equal one. But if you have reasons for using sigmoid, you can normalize your output by the sum of the vector elements so it sums to 1:
output = output/tf.reshape(tf.reduce_sum(output, 1), (-1, 1))
And you'll get:
import tensorflow as tf

output = tf.Variable([[0.03902826, 0.00037046]])
output = output / tf.reshape(tf.reduce_sum(output, 1), (-1, 1))
summedup = tf.reduce_sum(output, axis=1)

# TF 1.x session-style execution
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(output.eval())    # [[0.9905971  0.00940284]] - new output
    print(summedup.eval())  # [1.] - sums up to 1
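If you are on TensorFlow 2.x, a minimal sketch of the same normalization under eager execution (no session needed) would look like this:

import tensorflow as tf  # assumes TensorFlow 2.x with eager execution

output = tf.constant([[0.03902826, 0.00037046]])
output = output / tf.reshape(tf.reduce_sum(output, axis=1), (-1, 1))

print(output.numpy())                         # [[0.9905971  0.00940284]]
print(tf.reduce_sum(output, axis=1).numpy())  # [1.]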
To implement it in keras, you can create a subclass of tf.keras.layers.Layer like this:
from tensorflow.keras import layers

class NormLayer(layers.Layer):
    def __init__(self):
        super(NormLayer, self).__init__()

    def call(self, inputs):
        # divide each row by its sum so the outputs sum to 1
        return inputs / tf.reshape(tf.reduce_sum(inputs, 1), (-1, 1))
And then use it within your Sequential() model:
import numpy as np

# using dummy data to illustrate
x_train = np.array([[-1.551, -1.469], [1.022, 1.664]], dtype=np.float32)
y_train = np.array([[0, 1], [1, 0]], dtype=np.int32)

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, activation=tf.nn.sigmoid, input_shape=(2,)))
model.add(NormLayer())
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(x=x_train,
          y=y_train,
          epochs=2,
          batch_size=2)
# ...
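As a quick sanity check (a sketch reusing the dummy data above), you can verify that the predicted rows now sum to 1:

preds = model.predict(x_train)
print(preds.sum(axis=1))  # ~[1. 1.] - each row is normalized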
Upvotes: 2
Reputation: 420
Sigmoid produces an independent output between 0 and 1 for each class. If you use the same loss function for both softmax and sigmoid, it won't work; try binary_crossentropy with sigmoid instead. And if you have more than 2 classes, I don't think sigmoid is what you are looking for.
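For illustration, a minimal sketch of the sigmoid-plus-binary_crossentropy pairing this answer suggests (the input shape of 1792 matches the flatten layer in the question's model; num_class is a placeholder):

import tensorflow as tf

num_class = 2  # placeholder; set to your number of labels

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(num_class, activation='sigmoid', input_shape=(1792,)),
])
# binary_crossentropy scores each output unit as an independent
# Bernoulli probability, so the class outputs need not sum to 1
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])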
Upvotes: 1