Reputation: 357
I was assigned to write a simple network with nn.ModuleDict. So, here it is:
third_model = torch.nn.ModuleDict({
    'flatten': torch.nn.Flatten(),
    'fc1': torch.nn.Linear(32 * 32 * 3, 1024),
    'relu': torch.nn.ReLU(),
    'fc2': torch.nn.Linear(1024, 240),
    'relu': torch.nn.ReLU(),
    'fc3': torch.nn.Linear(240, 10)})
Then I tried to train it (on CUDA):
third_model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(third_model.parameters(), lr=0.001, momentum=0.9)
train(third_model, criterion, optimizer, train_dataloader, test_dataloader)
Function "train(model, criterion, optimizer, train_dataloader, test_dataloader)" trains the model and visualizes loss and accuracy of the model. It works properly.
Train:
def train(model, criterion, optimizer, train_dataloader, test_dataloader):
    train_loss_log = []
    train_acc_log = []
    val_loss_log = []
    val_acc_log = []
    for epoch in range(NUM_EPOCH):
        # training pass
        model.train()
        train_loss = 0.
        train_size = 0
        train_acc = 0.
        for imgs, labels in train_dataloader:
            imgs, labels = imgs.to(device), labels.to(device)
            optimizer.zero_grad()
            y_pred = model(imgs)
            loss = criterion(y_pred, labels)
            loss.backward()
            optimizer.step()
            train_loss += loss.item()
            train_size += y_pred.size(0)
            train_loss_log.append(loss.item() / y_pred.size(0))
            _, pred_classes = torch.max(y_pred, 1)
            train_acc += (pred_classes == labels).sum().item()
            train_acc_log.append(np.mean((pred_classes == labels).cpu().numpy()))

        # validation pass
        val_loss = 0.
        val_size = 0
        val_acc = 0.
        model.eval()
        with torch.no_grad():
            for imgs, labels in test_dataloader:
                imgs, labels = imgs.to(device), labels.to(device)
                pred = model(imgs)
                loss = criterion(pred, labels)
                val_loss += loss.item()
                val_size += pred.size(0)
                _, pred_classes = torch.max(pred, 1)
                val_acc += (pred_classes == labels).sum().item()
        val_loss_log.append(val_loss / val_size)
        val_acc_log.append(val_acc / val_size)

        clear_output()
        plot_history(train_loss_log, val_loss_log, 'loss')
        plot_history(train_acc_log, val_acc_log, 'accuracy')
        print('Train loss:', train_loss / train_size)
        print('Train acc:', train_acc / train_size)
        print('Val loss:', val_loss / val_size)
        print('Val acc:', val_acc / val_size)
I have already trained models built with nn.Sequential, and everything worked fine. However, with nn.ModuleDict I get an error:
TypeError Traceback (most recent call last)
<ipython-input-144-8b33ad3aad2c> in <module>()
2 optimizer = optim.SGD(third_model.parameters(), lr=0.001, momentum=0.9)
3
----> 4 train(third_model, criterion, optimizer, train_dataloader, test_dataloader)
1 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
TypeError: forward() takes 1 positional argument but 2 were given
I tried to find documentation on nn.ModuleDict, but there seem to be no examples of building networks with it.
It looks like the problem might be with the linear layers, although I do not know why.
I hope someone can explain where the mistake is; I would be grateful for any advice.
Upvotes: 2
Views: 2699
Reputation: 114796
nn.ModuleDict is a container, and its forward function is not defined; it should only be used to store sub-modules/networks.
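You can see this with a minimal self-contained snippet (the names md, fc, and x are just illustrative): calling the container itself fails, while the sub-modules it stores remain callable by key:
import torch

md = torch.nn.ModuleDict({'fc': torch.nn.Linear(4, 2)})
x = torch.randn(1, 4)
# md(x) would raise an error here, since ModuleDict itself does not
# implement forward (the exact exception can vary across PyTorch versions)
out = md['fc'](x)  # stored sub-modules are still callable by key
print(out.shape)  # torch.Size([1, 2])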
You should use nn.Sequential initialized with an ordered dictionary, OrderedDict. Note also that dictionary keys must be unique: your two 'relu' entries would overwrite each other, so they are renamed relu1 and relu2 here:
from collections import OrderedDict

third_model = torch.nn.Sequential(
    OrderedDict([
        ('flatten', torch.nn.Flatten()),
        ('fc1', torch.nn.Linear(32 * 32 * 3, 1024)),
        ('relu1', torch.nn.ReLU()),
        ('fc2', torch.nn.Linear(1024, 240)),
        ('relu2', torch.nn.ReLU()),
        ('fc3', torch.nn.Linear(240, 10))]))
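Alternatively, if the assignment specifically requires nn.ModuleDict, you can keep it, but you must wrap it in a module that defines the forward pass yourself. A minimal sketch of that approach (the class name ThirdModel and the layer names are my own):
import torch

class ThirdModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # the ModuleDict only stores and registers the layers;
        # the order of application is decided in forward()
        self.layers = torch.nn.ModuleDict({
            'flatten': torch.nn.Flatten(),
            'fc1': torch.nn.Linear(32 * 32 * 3, 1024),
            'relu1': torch.nn.ReLU(),
            'fc2': torch.nn.Linear(1024, 240),
            'relu2': torch.nn.ReLU(),
            'fc3': torch.nn.Linear(240, 10)})

    def forward(self, x):
        for name in ['flatten', 'fc1', 'relu1', 'fc2', 'relu2', 'fc3']:
            x = self.layers[name](x)
        return x
An instance created with third_model = ThirdModel() should then work with your train function unchanged.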
Upvotes: 5