Reputation: 2826
I was going through some tutorials on YouTube where the code sample below was used to explain derivatives:
import torch

x = torch.tensor(3.)                      # input, not tracked by autograd
w = torch.tensor(4., requires_grad=True)  # tracked: gradient will be computed
b = torch.tensor(5., requires_grad=True)  # tracked: gradient will be computed
x, w, b        # notebook-style display of the three tensors
y = w * x + b
y              # displays tensor(17., grad_fn=<AddBackward0>)
y.backward()   # compute gradients of y w.r.t. all tracked leaf tensors
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)
OUTPUT
dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)
Could anyone please explain how we get tensor(3.) and tensor(1.) as the gradient outputs? I need to understand how PyTorch performs this operation behind the scenes.
Any help would be appreciated.
Upvotes: 1
Views: 395
Reputation: 12837
You have y = w*x + b, so the partial derivatives are
dy/dx = w
dy/dw = x
dy/db = 1
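These formulas are easy to check numerically with torch.autograd.grad (a quick sketch using the question's values; asking for x's gradient here would raise an error, for the reason explained below):

import torch

x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)
y = w * x + b

# gradients of y with respect to the tracked leaves only
dy_dw, dy_db = torch.autograd.grad(y, (w, b))
print(dy_dw)  # tensor(3.) == x
print(dy_db)  # tensor(1.)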
Since you haven't set requires_grad=True for x, PyTorch won't calculate the derivative with respect to it. Hence, dy/dx is None.
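You can confirm this by tracking x as well (a minimal variation of the question's snippet); PyTorch then fills in dy/dx = w = 4:

import torch

x = torch.tensor(3., requires_grad=True)  # the only change: x is now tracked
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

y = w * x + b
y.backward()
print(x.grad)  # tensor(4.) == w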
The remaining gradients are just the values of the corresponding tensors, i.e. dy/dw = x = 3 and dy/db = 1. Thus, the final output is
dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)
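As for what happens behind the scenes: every operation on a tensor with requires_grad=True is recorded as a node in an autograd graph, and backward() walks that graph from y, applying the chain rule and accumulating the local derivatives into each leaf's .grad. You can peek at the recorded graph via grad_fn (a rough sketch; the exact repr of the nodes may vary across PyTorch versions):

import torch

x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)
y = w * x + b

# grad_fn is the graph node that backpropagates through the last op (the +)
print(y.grad_fn)                 # e.g. <AddBackward0 object at 0x...>
# next_functions are the upstream nodes: the multiply w*x and b's accumulator
print(y.grad_fn.next_functions)  # e.g. ((<MulBackward0 ...>, 0), (<AccumulateGrad ...>, 0))

y.backward()                     # chain rule through Add -> Mul -> AccumulateGrad
print(w.grad, b.grad)            # tensor(3.) tensor(1.)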
Upvotes: 2