Reputation: 151
I am new to PyTorch and I am going through this tutorial to figure out how to do deep learning with this library. I am having trouble understanding part of the code.
There is a class called Net and an object called model instantiated from it. Then there is a training function called train(epoch). On the second line of the train function body I see model.train(), which I can't make any sense of. Would you please help me understand this part of the code? How are we calling a method of a class when this method has not been defined inside the class? And how come the method has the exact same name as the function it is called inside? This is the class definition:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(2304, 256)
        self.fc2 = nn.Linear(256, 17)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(x.size(0), -1)  # Flatten layer
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.sigmoid(x)
model = Net() # On CPU
# model = Net().cuda() # On GPU
And this is the function defined after this class:
def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        # data, target = data.cuda(async=True), target.cuda(async=True)  # On GPU
        data, target = Variable(data), Variable(target)
        optimizer.zero_grad()
        output = model(data)
        loss = F.binary_cross_entropy(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.data[0]))
Upvotes: 0
Views: 962
Reputation: 1518
The def train(epoch): ... is a function that trains the model; it is not an attribute of the Net class. model is an object of the Net class, which inherits from nn.Module.
In PyTorch, all layers inherit from nn.Module, and that gives them a lot of common functionality, like model.children() or layer.children(), model.apply(), etc.
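For instance, here is a quick sketch of that shared nn.Module functionality, assuming the Net class from the question is defined (init_weights is just an illustrative helper, not part of the tutorial):
import torch.nn as nn

model = Net()  # the Net class from the question

# children() iterates over the submodules registered in __init__
for child in model.children():
    print(child)  # prints the two Conv2d, the Dropout2d, and the two Linear layers

# apply() calls a function on the module and, recursively, on every submodule
def init_weights(m):  # hypothetical helper for illustration
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)

model.apply(init_weights)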
The model.train() method is similarly implemented in nn.Module. It doesn't actually train the model, or any layer that inherits from nn.Module; it just sets the module (and all of its submodules) to training mode. You can think of it as if it were named something like model.set_train() instead.
There is an equivalent method, model.eval(), that you would call to set evaluation mode before testing the model.
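A minimal sketch of the effect: model.train() and model.eval() just flip the boolean training attribute on the module and on all of its submodules.
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 2), nn.Dropout(0.5))

model.train()             # the default mode after construction
print(model.training)     # True
print(model[1].training)  # True -- the mode propagates to submodules

model.eval()
print(model.training)     # False
print(model[1].training)  # False
This is also why the forward method in the question passes training=self.training to F.dropout: the functional dropout doesn't know the module's mode unless you hand it the flag.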
There can be layers in a model that are supposed to behave differently in training and evaluation modes. The obvious examples are BatchNorm, which normalizes with per-batch statistics during training but with its accumulated running mean and variance during evaluation, and Dropout, which randomly zeroes activations only in training mode.
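For example, a quick check of that difference with the standard torch.nn layers (a sketch, runnable on any recent PyTorch version):
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)
x = torch.ones(2, 8)

drop.train()
print(drop(x))  # about half the entries zeroed, the rest scaled by 1/(1-p) = 2

drop.eval()
print(drop(x))  # identity: the input passes through unchanged

bn = nn.BatchNorm1d(8)
batch = torch.randn(4, 8)
bn.train()
bn(batch)       # normalizes with this batch's mean/var and updates the running stats
bn.eval()
bn(batch)       # normalizes with the accumulated running mean/var instead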
Upvotes: 2