nkhuyu

Reputation: 860

Why does PyTorch have two kinds of non-linear activations?


Non-linear activations (weighted sum, nonlinearity): https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity

Non-linear activations (other): https://pytorch.org/docs/stable/nn.html#non-linear-activations-other

Upvotes: 1

Views: 1146

Answers (1)

kmario23

Reputation: 61475

The primary difference is that the functions listed under Non-linear activations (weighted sum, nonlinearity) only apply an element-wise transformation (thresholding/squashing) and do not normalize the output, i.e. the resulting tensor need not sum to 1, either as a whole or along any particular axis/dimension (see the sketch after the list below).

Example non-linearities:

nn.ReLU
nn.Sigmoid
nn.SELU
nn.Tanh
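
Here is a minimal sketch (with assumed example values) illustrating this: ReLU and Sigmoid act on each element independently, and the resulting tensor does not sum to 1.

    import torch
    import torch.nn as nn

    x = torch.tensor([1.0, -2.0, 3.0])

    relu = nn.ReLU()
    sigmoid = nn.Sigmoid()

    # ReLU only thresholds negative values to 0; no normalization happens.
    print(relu(x))           # tensor([1., 0., 3.])
    print(relu(x).sum())     # tensor(4.) -- not 1

    # Sigmoid squashes each element into (0, 1) independently of the others,
    # so the elements generally do not sum to 1 either.
    print(sigmoid(x).sum())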


The non-linearities listed under Non-linear activations (other), by contrast, perform both thresholding and normalization, i.e. the resulting tensor sums to 1, either over the whole tensor if no axis/dimension is specified, or along the specified axes/dimensions.

Example non-linearities: (note the normalization term in the denominator)

nn.Softmax
nn.Softmin

The exception is nn.LogSoftmax(), whose output does not sum to 1, since the log is applied on top of the softmax output (see the sketch below).
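
A minimal sketch (with assumed example values) showing both points: nn.Softmax normalizes along the given dimension so each row sums to 1, whereas nn.LogSoftmax does not.

    import torch
    import torch.nn as nn

    x = torch.tensor([[1.0, 2.0, 3.0],
                      [1.0, 1.0, 1.0]])

    softmax = nn.Softmax(dim=1)
    log_softmax = nn.LogSoftmax(dim=1)

    # Softmax normalizes along dim=1, so every row sums to 1.
    print(softmax(x).sum(dim=1))      # tensor([1., 1.])

    # LogSoftmax returns log(softmax(x)); its rows are negative and do not sum to 1.
    print(log_softmax(x).sum(dim=1))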

Upvotes: 1
