Reputation: 86640
I'm looking at this implementation of SGD for PyTorch: https://pytorch.org/docs/stable/_modules/torch/optim/sgd.html#SGD
And I see some strange calculations which I don't understand.
For instance, take a look at p.data.add_(-group['lr'], d_p). It makes sense to think that the two arguments are multiplied together, right? That's how SGD works: -lr * grads.
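In other words, I'd expect the call to be shorthand for something like this (my own rewrite, not from the PyTorch source):

import torch

lr = 0.01
p = torch.tensor([1.0, 2.0, 3.0])    # a parameter
d_p = torch.tensor([0.5, 0.5, 0.5])  # its gradient

p = p + (-lr) * d_p                  # p <- p - lr * d_p
print(p)                             # tensor([0.9950, 1.9950, 2.9950])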
But the documentation of the function doesn't say anything about this.
And what is even more confusing: although this SGD code actually works (I tested it by copying the code and adding prints right after the add_ call), I can't call add_ with two arguments the way it does:
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([6, 10, 15])
c = torch.tensor([100, 100, 100])
a.add_(b, c)  # this returns an error about using too many arguments
print(a)
What's going on here? What am I missing?
Upvotes: 5
Views: 3262
Reputation: 812
This works for scalars:

import torch as t

a = t.tensor(1)
b = t.tensor(2)
c = t.tensor(3)
a.add_(b, c)
print(a)  # tensor(7)
Or a can be a tensor:

a = t.tensor([[1, 1], [1, 1]])
b = t.tensor(2)
c = t.tensor(3)
a.add_(b, c)
print(a)  # tensor([[7, 7], [7, 7]])
The output is 7 in both cases because the second positional argument is matched against the overload add_(Tensor other, Number alpha), which computes self += alpha * other; here that is 1 + 3 * 2 = 7. A zero-dimensional tensor like t.tensor(3) is accepted where a Number is expected, which is why these scalar examples work while your call with three multi-element tensors does not match any overload.
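As far as I can tell, the SGD line you quoted goes through the companion overload with the arguments swapped, add_(Number alpha, Tensor other), which was later deprecated in favor of passing alpha as an explicit keyword. A minimal sketch of how the two spellings line up (the concrete values are mine, not from the SGD source):

import torch

lr = 0.1
p = torch.tensor([1.0, 2.0, 3.0])     # stand-in for p.data
d_p = torch.tensor([6.0, 10.0, 15.0]) # stand-in for the gradient

# explicit modern spelling of p.data.add_(-group['lr'], d_p):
p.add_(d_p, alpha=-lr)  # p += (-lr) * d_p
print(p)                # tensor([0.4000, 1.0000, 1.5000])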
Upvotes: 3