Brian Feeny

Reputation: 451

model.parameters() does not produce an iterable of Tensors

I am trying to use torch.nn.utils.clip_grad_norm_(), which requires an iterable of Tensors. See below:

for epoch in progress_bar(range(num_epochs)): 
    lstm.train()
    outputs = lstm(trainX.to(device))
    optimizer.zero_grad()
    torch.nn.utils.clip_grad_norm_(lstm.parameters(), 1)

My code errors with:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-168-4cd34e6fd44d> in <module>
     28     lstm.train()
     29     outputs = lstm(trainX.to(device))
---> 30     torch.nn.utils.clip_grad_norm_(lstm.parameters(), 1)
     31 
     32 

/opt/conda/lib/python3.6/site-packages/torch/nn/utils/clip_grad.py in clip_grad_norm_(parameters, max_norm, norm_type)
     28         total_norm = max(p.grad.detach().abs().max() for p in parameters)
     29     else:
---> 30         total_norm = torch.norm(torch.stack([torch.norm(p.grad.detach(), norm_type) for p in parameters]), norm_type)
     31     clip_coef = max_norm / (total_norm + 1e-6)
     32     if clip_coef < 1:

RuntimeError: stack expects a non-empty TensorList

If I examine lstm.parameters(), I get a list of Parameters instead of a list of Tensors:

<class 'torch.nn.parameter.Parameter'> torch.Size([2048, 1])
<class 'torch.nn.parameter.Parameter'> torch.Size([2048, 512])
<class 'torch.nn.parameter.Parameter'> torch.Size([2048])
<class 'torch.nn.parameter.Parameter'> torch.Size([2048])
<class 'torch.nn.parameter.Parameter'> torch.Size([2048, 512])
<class 'torch.nn.parameter.Parameter'> torch.Size([2048, 512])
<class 'torch.nn.parameter.Parameter'> torch.Size([2048])
<class 'torch.nn.parameter.Parameter'> torch.Size([2048])
<class 'torch.nn.parameter.Parameter'> torch.Size([1, 512])
<class 'torch.nn.parameter.Parameter'> torch.Size([1])
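For reference, the listing above comes from something roughly like this:

for p in lstm.parameters():
    print(type(p), p.size())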

Looking at the first Parameter, it appears to be a list of Tensors:

<class 'torch.Tensor'> torch.Size([1])
<class 'torch.Tensor'> torch.Size([1])
<class 'torch.Tensor'> torch.Size([1])
<class 'torch.Tensor'> torch.Size([1])
<class 'torch.Tensor'> torch.Size([1])
<class 'torch.Tensor'> torch.Size([1])
...
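That output comes from iterating over the first Parameter directly, something along these lines:

first = next(iter(lstm.parameters()))   # the [2048, 1] Parameter above
for row in first:                       # iterating a 2-D tensor yields its rows
    print(type(row), row.size())        # each row is a plain torch.Tensor of shape [1]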

Does anyone know what is going on here?

Upvotes: 0

Views: 1187

Answers (1)

Szymon Maszke

Reputation: 24726

PyTorch's clip_grad_norm_, as the name suggests, operates on gradients, not on the parameters themselves. You have to compute your loss from the outputs, call loss.backward() to populate the gradients, and only then perform gradient clipping.

Also, you should call optimizer.step() after this operation.

Something like this:

for epoch in progress_bar(range(num_epochs)):
    lstm.train()
    for batch in dataloader:
        inputs, targets = batch                               # unpack the current mini-batch
        optimizer.zero_grad()                                 # clear gradients from the previous step
        outputs = lstm(inputs.to(device))                     # forward pass
        loss = my_loss(outputs, targets.to(device))           # compute the loss
        loss.backward()                                       # backward pass: populates p.grad
        torch.nn.utils.clip_grad_norm_(lstm.parameters(), 1)  # gradients exist now, so clipping works
        optimizer.step()                                      # apply the (clipped) update

You don't have parameter.grad calculated yet (its value is None), and that's the reason for your error: clip_grad_norm_ first filters out parameters whose grad is None, so it ends up calling torch.stack on an empty list.
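A quick way to check, reusing trainX, targets and my_loss from the sketch above:

# Before any backward() call, no gradients have been computed yet:
print(all(p.grad is None for p in lstm.parameters()))       # True -> clipping would fail

outputs = lstm(trainX.to(device))
loss = my_loss(outputs, targets)
loss.backward()                                             # populates p.grad for every used parameter

print(all(p.grad is not None for p in lstm.parameters()))   # True for this LSTM
torch.nn.utils.clip_grad_norm_(lstm.parameters(), 1)        # no more "empty TensorList" error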

Upvotes: 1
