Reputation: 3535
Many loss functions in PyTorch are implemented both in nn.modules.loss and in nn.functional.
For example, the following two lines return the same result:
import torch.nn as nn
import torch.nn.functional as F
nn.L1Loss()(x,y)
F.l1_loss(x,y)
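For a concrete check (a minimal sketch; x and y are just placeholder tensors of the same shape):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 3)
y = torch.randn(4, 3)
print(torch.allclose(nn.L1Loss()(x, y), F.l1_loss(x, y)))  # True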
Why are there two implementations?
Upvotes: 4
Views: 3092
Reputation: 55
Here is the code of BCEWithLogitsLoss with the docstring removed:
class BCEWithLogitsLoss(_Loss):
    def __init__(self, weight: Optional[Tensor] = None, size_average=None, reduce=None,
                 reduction: str = 'mean', pos_weight: Optional[Tensor] = None) -> None:
        super(BCEWithLogitsLoss, self).__init__(size_average, reduce, reduction)
        self.register_buffer('weight', weight)
        self.register_buffer('pos_weight', pos_weight)

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        return F.binary_cross_entropy_with_logits(input, target,
                                                  self.weight,
                                                  pos_weight=self.pos_weight,
                                                  reduction=self.reduction)
If parameter passing is set aside, the class and the function implementations are exactly the same. However, using the class implementation keeps your code more concise and readable, e.g.
Using the function:

loss_fn = F.binary_cross_entropy_with_logits

# every configuration value has to be threaded through the training loop
def train(model, dataloader, loss_fn, optimizer, weight, size_average, reduce, reduction, pos_weight):
    for x, y in dataloader:
        model.zero_grad()
        y_pred = model(x)
        loss = loss_fn(y_pred, y, weight, size_average, reduce, reduction, pos_weight)
        loss.backward()
        optimizer.step()
Using the class:

# the configuration is bundled into the loss object once
loss_fn = BCEWithLogitsLoss(weight, size_average, reduce, reduction, pos_weight)

def train(model, dataloader, loss_fn, optimizer):
    for x, y in dataloader:
        model.zero_grad()
        y_pred = model(x)
        loss = loss_fn(y_pred, y)
        loss.backward()
        optimizer.step()
If you have several parameters or different loss functions, the class implementation is better.
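For instance (a small sketch of my own with made-up tensors, not from the code above), differently configured loss objects all expose the same two-argument call, so the calling code never changes:

import torch
import torch.nn as nn

# hypothetical logits and binary targets, purely for illustration
x = torch.randn(8, 1)
y = torch.randint(0, 2, (8, 1)).float()

# each object carries its own configuration, yet the call site is identical
for loss_fn in (nn.BCEWithLogitsLoss(pos_weight=torch.tensor([2.0])),
                nn.BCEWithLogitsLoss(reduction='sum'),
                nn.L1Loss()):
    print(type(loss_fn).__name__, loss_fn(x, y).item())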
Upvotes: 1
Reputation: 13103
I think of it as a partial-application situation - it's useful to be able to "bundle" many of the configuration variables with the loss function object. In most cases, your loss function has to take prediction and ground_truth as its arguments. This makes for a fairly uniform basic API of loss functions. However, they differ in the details. For instance, not every loss function has a reduction parameter. BCEWithLogitsLoss has weight and pos_weight parameters; PoissonNLLLoss has log_input and eps. It's handy to write a function like
def one_epoch(model, dataset, loss_fn, optimizer):
    for x, y in dataset:
        model.zero_grad()
        y_pred = model(x)
        loss = loss_fn(y_pred, y)
        loss.backward()
        optimizer.step()
which works equally well with an instantiated BCEWithLogitsLoss as with PoissonNLLLoss. But it cannot work with their functional counterparts, because of the bookkeeping necessary. You would instead have to first create

loss_fn_packed = functools.partial(F.binary_cross_entropy_with_logits, weight=my_weight, reduction='sum')

and only then can you use it with one_epoch as defined above. But this packing is already provided with the object-oriented loss API, along with some bells and whistles (since losses subclass nn.Module, you can use forward and backward hooks, move things between CPU and GPU, etc.).
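As a quick illustration (a sketch of my own, not from the answer above; the pos_weight value and the logging hook are invented):

import torch
import torch.nn as nn

loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([2.0]))

# module hooks apply to losses too, e.g. to log every computed value
def log_loss(module, inputs, output):
    print(f"{type(module).__name__}: {output.item():.4f}")

handle = loss_fn.register_forward_hook(log_loss)

# registered buffers (weight, pos_weight) travel with the module
device = 'cuda' if torch.cuda.is_available() else 'cpu'
loss_fn = loss_fn.to(device)

pred = torch.randn(8, 1, device=device)
target = torch.randint(0, 2, (8, 1), device=device).float()
loss = loss_fn(pred, target)  # the forward hook prints the loss value
handle.remove()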
Upvotes: 4