rafafan2010

Reputation: 1579

How does PyTorch's `autograd` handle non-mathematical functions?

During my training process, I make a lot of calls to `torch.cat()` and copy tensors into new tensors. How does autograd handle these operations? Do they affect the gradient values?

Upvotes: 1

Views: 214

Answers (1)

jodag

Reputation: 22184

As pointed out in the comments, `cat` is a mathematical function. For example, we could write the following (special-case) definition of `cat` in more traditional mathematical notation:

$$\operatorname{cat}(x, y) = \begin{bmatrix} x_1 & \cdots & x_n & y_1 & \cdots & y_m \end{bmatrix}^\top, \qquad x \in \mathbb{R}^n,\ y \in \mathbb{R}^m$$

The Jacobian of this function w.r.t. either of its inputs can be expressed as

$$\frac{\partial \operatorname{cat}(x, y)}{\partial x} = \begin{bmatrix} I_n \\ 0_{m \times n} \end{bmatrix}, \qquad \frac{\partial \operatorname{cat}(x, y)}{\partial y} = \begin{bmatrix} 0_{n \times m} \\ I_m \end{bmatrix}$$

Since the Jacobian is well defined, you can, of course, apply back-propagation.
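As a minimal sketch of this (the tensor sizes and the loss are illustrative assumptions, not part of the original answer), you can check that the gradient `torch.cat` passes back to each input is exactly that input's slice of the upstream gradient, just as the block-identity Jacobians above predict:

```python
import torch

x = torch.randn(3, requires_grad=True)
y = torch.randn(2, requires_grad=True)

# Concatenate, then reduce to a scalar so we can call backward().
out = torch.cat([x, y])        # shape (5,)
loss = (out ** 2).sum()        # arbitrary illustrative loss
loss.backward()

# d(loss)/d(out) = 2 * out; each input receives its own slice of it,
# which is what multiplying by the block-identity Jacobian does.
print(torch.allclose(x.grad, 2 * x))  # True
print(torch.allclose(y.grad, 2 * y))  # True
```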

In practice you generally wouldn't define these operations with such notation, and writing a fully general definition of the `cat` operation as PyTorch uses it in this way would be cumbersome.

That said, internally autograd uses backward algorithms that account for the gradients of such "index-style" operations just like those of any other function.
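The same holds for the copy-into-a-new-tensor pattern from the question. Here is a small sketch (again with illustrative shapes): the backward pass routes the upstream gradient back only to the positions that were copied.

```python
import torch

src = torch.randn(3, requires_grad=True)

# Copy src into a slice of a freshly created tensor.
dest = torch.zeros(5)
dest[1:4] = src                # index-style copy; autograd records it
loss = dest.sum()
loss.backward()

# Only the copied positions receive gradient, so each element of src
# gets gradient 1 from the sum.
print(src.grad)                # tensor([1., 1., 1.])
```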

Upvotes: 1
