Reputation: 45
I’m a newbie learning deep learning, and I’m stuck trying to understand what ‘.backward()’ from PyTorch does, since it does pretty much most of the work there. Therefore, I’m trying to understand in detail what the backward function does, so I’m going to try to code what the function does step by step. Are there any resources you can recommend (book, video, GitHub repo) to start coding the function? Thank you for your time and hopefully for your help.
Upvotes: 3
Views: 630
Reputation: 46291
backward() calculates the gradients with respect to (w.r.t.) the graph leaves.

The grad() function is more general: it can calculate the gradients w.r.t. any inputs (leaves included).
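As a quick illustration of that difference in PyTorch itself (a minimal sketch using standard PyTorch calls; the tensor values are arbitrary and just for demonstration):

import torch

x = torch.tensor(2.0, requires_grad=True)   # a leaf tensor
y = x ** 2                                  # an intermediate (non-leaf) node
z = y * 3

# backward() accumulates gradients into the .grad attribute of the leaves
z.backward(retain_graph=True)
print(x.grad)                               # dz/dx = 6*x = 12.0

# torch.autograd.grad() can return the gradient w.r.t. any tensor in the graph
dz_dy, = torch.autograd.grad(z, y)
print(dz_dy)                                # dz/dy = 3.0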
I implemented a grad() function some time ago; you may check it below. It uses the power of Automatic Differentiation (AD).
import math

class ADNumber:

    def __init__(self, val, name=""):
        self.name = name
        self._val = val
        self._children = []  # list of (local partial derivative, downstream node) pairs

    def __truediv__(self, other):
        new = ADNumber(self._val / other._val, name=f"{self.name}/{other.name}")
        self._children.append((1.0 / other._val, new))
        other._children.append((-self._val / other._val**2, new))  # first derivative of 1/x is -1/x^2
        return new

    def __mul__(self, other):
        new = ADNumber(self._val * other._val, name=f"{self.name}*{other.name}")
        self._children.append((other._val, new))
        other._children.append((self._val, new))
        return new

    def __add__(self, other):
        if isinstance(other, (int, float)):
            other = ADNumber(other, str(other))
        new = ADNumber(self._val + other._val, name=f"{self.name}+{other.name}")
        self._children.append((1.0, new))
        other._children.append((1.0, new))
        return new

    def __sub__(self, other):
        new = ADNumber(self._val - other._val, name=f"{self.name}-{other.name}")
        self._children.append((1.0, new))
        other._children.append((-1.0, new))
        return new

    @staticmethod
    def exp(self):
        new = ADNumber(math.exp(self._val), name=f"exp({self.name})")
        self._children.append((math.exp(self._val), new))  # first derivative of exp(x) is exp(x)
        return new

    @staticmethod
    def sin(self):
        new = ADNumber(math.sin(self._val), name=f"sin({self.name})")
        self._children.append((math.cos(self._val), new))  # first derivative of sin(x) is cos(x)
        return new

    def grad(self, other):
        # returns d(self)/d(other): walk from `other` through the nodes it feeds into,
        # chaining the local partial derivatives and summing over all paths
        if self == other:
            return 1.0
        else:
            result = 0.0
            for child in other._children:
                result += child[0] * self.grad(child[1])
            return result
A = ADNumber  # shortcuts
sin = A.sin
exp = A.exp

def print_childs(f, wrt):  # wrt = with respect to
    for e in f._children:
        print("child:", wrt, "->", e[1].name, "grad: ", e[0])
        print_childs(e[1], e[1].name)  # recurse into the downstream node

x1 = A(1.5, name="x1")
x2 = A(0.5, name="x2")
f = (sin(x2) + 1) / (x2 + exp(x1)) + x1 * x2

print_childs(x2, "x2")
print("\ncalculated gradient for the function f with respect to x2:", f.grad(x2))
Out:
child: x2 -> sin(x2) grad: 0.8775825618903728
child: sin(x2) -> sin(x2)+1 grad: 1.0
child: sin(x2)+1 -> sin(x2)+1/x2+exp(x1) grad: 0.20073512936690338
child: sin(x2)+1/x2+exp(x1) -> sin(x2)+1/x2+exp(x1)+x1*x2 grad: 1.0
child: x2 -> x2+exp(x1) grad: 1.0
child: x2+exp(x1) -> sin(x2)+1/x2+exp(x1) grad: -0.05961284871202578
child: sin(x2)+1/x2+exp(x1) -> sin(x2)+1/x2+exp(x1)+x1*x2 grad: 1.0
child: x2 -> x1*x2 grad: 1.5
child: x1*x2 -> sin(x2)+1/x2+exp(x1)+x1*x2 grad: 1.0
calculated gradient for the function f with respect to x2: 1.6165488003791766
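As a sanity check (a minimal sketch, assuming you have PyTorch installed), you can reproduce the same value with backward() itself:

import torch

x1 = torch.tensor(1.5, requires_grad=True)
x2 = torch.tensor(0.5, requires_grad=True)
f = (torch.sin(x2) + 1) / (x2 + torch.exp(x1)) + x1 * x2

f.backward()              # populates .grad on the leaves x1 and x2
print(x2.grad.item())     # ~1.6165488..., matching the value above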
Upvotes: 2