Reputation: 138357
I'm trying out PyTorch for the first time and running into a couple of problems. I've shared some of my code below and have two questions.
Q1: What should my output size be? Each input should lead to one output, equal to one of 6 possible output labels (1-6). Should output size be 1 or 6?
Q2: Something is wrong with my accuracy calculation, which I think is tied into Q1. predicted ends up being of size 4 x 4411, where 4 is my batch size (and so I think that's correct) but 4411 is my feature/input size, which is almost certainly wrong; I would expect it to be 6 (the number of possible output labels). labels is 4 x 6, which I think is correct. If I change the dim I'm taking the max over from 1 to 2, then predicted gets the correct size of 4 x 6, but logically that makes no sense, as it would be returning the max index across all feature values for an input.
I think I'm missing something crucial about what PyTorch is doing with my data. I feel so close... Any ideas on how I can fix this? Thanks!
import torch

class Net(torch.nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super(Net, self).__init__()
        self.input_size = input_size
        self.output_size = output_size
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers

        self.rnn = torch.nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        # Q1: What should my output_size be?
        self.fc = torch.nn.Linear(hidden_dim, output_size)

    def forward(self, x):
        batch_size = x.size(0)
        hidden = self.init_hidden(batch_size)
        out, hidden = self.rnn(x, hidden)  # out: [batch, seq_len, hidden_dim]
        out = self.fc(out)
        return out, hidden

    def init_hidden(self, batch_size):
        return torch.zeros(self.n_layers, batch_size, self.hidden_dim)
if __name__ == '__main__':
    # ... code removed that just creates the Dataloaders and initialises some size variables

    net = Net(input_size=dataset.input_size, output_size=6, hidden_dim=24, n_layers=1)
    net.to(device)

    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)

    for epoch in range(1, epochs + 1):
        total = 0
        correct = 0
        for inputs, labels in dataloaders['train']:
            optimizer.zero_grad()
            inputs, labels = inputs.to(device), labels.to(device)

            output, hidden = net(inputs)
            loss = criterion(output, labels)  # error
            loss.backward()
            optimizer.step()

            # Q2: Something is wrong here
            _, predicted = torch.max(output.data, 1)
            total += inputs.size(0)
            correct += (predicted == labels).sum().item()
            print(predicted == labels)

        accuracy = 100 * correct / total
Upvotes: 1
Views: 972
Reputation: 24231
CrossEntropyLoss expects output of size (n) (with a score for each of the n classes) and label as an integer giving the index of the correct class. Your output is of shape [batch, input_size, 6], since it is an RNN producing a sequence of the same length as the input (with 6 scores per element of the sequence). If you wish to have one class label prediction per sequence (not per element of the sequence), you will need to collapse this to a tensor of shape [batch, 6].
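To make the shapes concrete, here is a minimal sketch of one way to do this. It assumes you take each sequence's prediction from the last time step (the slice out[:, -1, :] is one common choice, not the only one) and that labels holds integer class indices rather than one-hot vectors; the argmax conversion shown is only needed if your labels are currently one-hot:

import torch

# Inside forward(), collapse the per-timestep outputs to one vector per
# sequence before the final linear layer (here: the last time step):
#   out, hidden = self.rnn(x, hidden)   # out: [batch, seq_len, hidden_dim]
#   out = out[:, -1, :]                 # out: [batch, hidden_dim]
#   out = self.fc(out)                  # out: [batch, 6]

criterion = torch.nn.CrossEntropyLoss()
output = torch.randn(4, 6)             # [batch, num_classes] raw scores (logits)
labels = torch.tensor([0, 2, 5, 1])    # [batch] integer class indices, not one-hot
loss = criterion(output, labels)       # shapes [4, 6] and [4] line up

# If your labels are currently one-hot ([batch, 6]), convert them first:
#   labels = one_hot_labels.argmax(dim=1)

# torch.max over dim 1 then yields one predicted class index per sample:
_, predicted = torch.max(output, 1)    # predicted: [batch]
correct = (predicted == labels).sum().item()

With output collapsed to [batch, 6], taking the max over dim 1 is correct, and predicted compares element-wise against labels as your accuracy calculation expects.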
Upvotes: 1