tamuhey

Reputation: 3535

Pytorch: Why loss functions are implemented both in nn.modules.loss and nn.functional module?

Many loss functions in Pytorch are implemented both in nn.modules.loss and nn.functional.

For example, the two lines below return the same result.

import torch
import torch.nn as nn
import torch.nn.functional as F

x, y = torch.randn(4, 5), torch.randn(4, 5)
nn.L1Loss()(x, y)   # class API
F.l1_loss(x, y)     # functional API

Why are there two implementations?

  1. Consistency with other parametric loss functions
  2. Instantiating a loss function brings some benefit
  3. Some other reason

Upvotes: 4

Views: 3092

Answers (2)

njulhy

Reputation: 55

Here is the code of BCEWithLogitsLoss with the docstring stripped:

class BCEWithLogitsLoss(_Loss):
    def __init__(self, weight: Optional[Tensor] = None, size_average=None, reduce=None, reduction: str = 'mean',
                 pos_weight: Optional[Tensor] = None) -> None:
        super(BCEWithLogitsLoss, self).__init__(size_average, reduce, reduction)
        self.register_buffer('weight', weight)
        self.register_buffer('pos_weight', pos_weight)

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        return F.binary_cross_entropy_with_logits(input, target,
                                                  self.weight,
                                                  pos_weight=self.pos_weight,
                                                  reduction=self.reduction)

Setting parameter passing aside, the class and the function implementations are exactly the same. However, the class implementation keeps your code more concise and readable, e.g.

Using the function:

loss_fn = F.binary_cross_entropy_with_logits

def train(model, dataloader, loss_fn, optimizer, weight, size_average, reduce, reduction, pos_weight):
    for x, y in dataloader:
        model.zero_grad()
        y_pred = model(x)
        # every configuration argument has to be threaded through on each call
        loss = loss_fn(y_pred, y, weight, size_average, reduce, reduction, pos_weight)
        loss.backward()
        optimizer.step()

Using the class:

loss_fn = nn.BCEWithLogitsLoss(weight, size_average, reduce, reduction, pos_weight)

def train(model, dataloader, loss_fn, optimizer):
    for x, y in dataloader:
        model.zero_grad()
        y_pred = model(x)
        loss = loss_fn(y_pred, y)   # configuration is already bundled into the instance
        loss.backward()
        optimizer.step()

If you have several such parameters, or switch between different loss functions, the class implementation is better.
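
As a quick sanity check (a minimal sketch with made-up tensors, not part of the original answer), the two APIs return identical values when given the same configuration:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(8, 3)                    # raw logits
y = torch.randint(0, 2, (8, 3)).float()  # binary targets
pos_weight = torch.full((3,), 2.0)

# Class API: configuration is stored on the module instance.
loss_cls = nn.BCEWithLogitsLoss(pos_weight=pos_weight)(x, y)

# Functional API: configuration is passed explicitly on every call.
loss_fn = F.binary_cross_entropy_with_logits(x, y, pos_weight=pos_weight)

assert torch.allclose(loss_cls, loss_fn)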

Upvotes: 1

Jatentaki

Reputation: 13103

I think of it as a partial application situation - it's useful to be able to "bundle" many of the configuration variables with the loss function object. In most cases, your loss function has to take prediction and ground_truth as its arguments. This makes for a fairly uniform basic API across loss functions. However, they differ in the details: not every loss function has a reduction parameter, BCEWithLogitsLoss has weight and pos_weight parameters, and PoissonNLLLoss has log_input and eps. It's handy to write a function like

def one_epoch(model, dataset, loss_fn, optimizer):
    for x, y in dataset:
        model.zero_grad()
        y_pred = model(x)
        loss = loss_fn(y_pred, y)
        loss.backward()
        optimizer.step()

which works equally well with an instantiated BCEWithLogitsLoss as with a PoissonNLLLoss. But it cannot work with their functional counterparts, because of the extra bookkeeping required. You would instead have to first create

loss_fn_packed = functools.partial(F.binary_cross_entropy_with_logits, weight=my_weight, reduction='sum')

and only then can you use it with one_epoch as defined above. But this packing already comes with the object-oriented loss API, along with some bells and whistles (since losses subclass nn.Module, you get forward and backward hooks, moving data between CPU and GPU, etc).
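
For instance (a minimal sketch, assuming a CUDA device is available), because the loss object is an nn.Module, registered buffers such as pos_weight travel with it when you move it:

import torch
import torch.nn as nn

loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.ones(10))

if torch.cuda.is_available():
    loss_fn = loss_fn.cuda()          # the pos_weight buffer moves to the GPU as well
    print(loss_fn.pos_weight.device)  # cuda:0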

Upvotes: 4
