Reputation: 860
Why does PyTorch have two kinds of non-linear activations?
Non-linear activations (weighted sum, nonlinearity): https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity
Non-linear activations (other): https://pytorch.org/docs/stable/nn.html#non-linear-activations-other
Upvotes: 1
Views: 1146
Reputation: 61475
The primary difference is that the functions listed under Non-linear activations (weighted sum, nonlinearity) perform only thresholding and do not normalize the output (i.e. the resultant tensor need not necessarily sum up to 1, either as a whole or along some specified axes/dimensions).
Example non-linearities:
nn.ReLU
nn.Sigmoid
nn.SELU
nn.Tanh
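
As a quick sketch (my own example, not part of the original answer), you can see that these activations transform each element independently and leave the result un-normalized:

import torch
import torch.nn as nn

x = torch.tensor([1.0, 2.0, 3.0])

# Each element is transformed on its own; the result is not normalized,
# so the sums below are not 1 in general.
print(nn.ReLU()(x).sum())     # tensor(6.)
print(nn.Sigmoid()(x).sum())  # roughly 2.56, not 1
print(nn.Tanh()(x).sum())     # roughly 2.72, not 1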
Whereas the non-linearities listed under Non-linear activations (other) perform thresholding and normalization (i.e. the resultant tensor sums up to 1, either for the whole tensor if no axis/dimension is specified, or along the specified axes/dimensions).
Example non-linearities, such as nn.Softmax and nn.LogSoftmax (note the normalization term in the denominator of the softmax formula).
The exception is nn.LogSoftmax(), for which the resultant tensor doesn't sum up to 1, since log is applied over the softmax output.
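
A minimal sketch (again my own example) showing the normalization, and the nn.LogSoftmax exception:

import torch
import torch.nn as nn

x = torch.tensor([[1.0, 2.0, 3.0],
                  [1.0, 1.0, 1.0]])

# Softmax normalizes along the given dim, so each row sums to 1.
probs = nn.Softmax(dim=1)(x)
print(probs.sum(dim=1))            # tensor([1., 1.])

# LogSoftmax is log(softmax(x)): its values don't sum to 1,
# but exponentiating them recovers the normalized softmax output.
log_probs = nn.LogSoftmax(dim=1)(x)
print(log_probs.sum(dim=1))        # negative values, not 1
print(log_probs.exp().sum(dim=1))  # tensor([1., 1.])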
Upvotes: 1