ZKS

Reputation: 2826

Tensor Operation and gradient

I was going through some tutorials on YouTube where the code sample below was used to explain derivatives.

Create tensors.

    x = torch.tensor(3.)

    w = torch.tensor(4., requires_grad=True)

    b = torch.tensor(5., requires_grad=True)

    x, w, b

Arithmetic operations

    y = w * x + b

    y

Compute derivatives

    y.backward()

Display gradients

    print('dy/dx:', x.grad)

    print('dy/dw:', w.grad)

    print('dy/db:', b.grad)

OUTPUT

dy/dx: None

dy/dw: tensor(3.)

dy/db: tensor(1.)

Could anyone please explain to me how we get tensor(3.) and tensor(1.) as the gradient outputs? I need to understand how PyTorch is performing this operation behind the scenes.

Any help would be appreciated.

Upvotes: 1

Views: 395

Answers (1)

Harshit Kumar

Reputation: 12837

You have y = w*x + b, then

dy/dx = w
dy/dw = x
dy/db = 1

Since you've not set requires_grad=True for x, PyTorch won't calculate derivative w.r.t. it.
Hence, dy/dx = None

The rest are the values of the corresponding tensors: dy/dw = x = 3 and dy/db = 1. Thus, the final output is

dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)
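A minimal sketch to confirm this: if you also set requires_grad=True on x (a change to the question's code, not part of the original), autograd tracks the derivative with respect to x as well, and x.grad comes out as w = 4.

```python
import torch

# Same setup as in the question, but with requires_grad=True on x too,
# so PyTorch also computes dy/dx.
x = torch.tensor(3., requires_grad=True)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

y = w * x + b   # y = 4*3 + 5 = 17
y.backward()    # populates .grad on every leaf tensor with requires_grad=True

print('dy/dx:', x.grad)  # tensor(4.)  -> dy/dx = w
print('dy/dw:', w.grad)  # tensor(3.)  -> dy/dw = x
print('dy/db:', b.grad)  # tensor(1.)  -> dy/db = 1
```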

Upvotes: 2
