Reputation: 973
Hi, I'm new to RNNs. I found the "NLP From Scratch" RNN tutorial in the official PyTorch tutorials, and I think it's called "from scratch" because it doesn't use the built-in nn.RNN module, i.e. there is no line like
self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
inside the def __init__(self, input_size, hidden_size, output_size): part. So how would the code have looked if nn.RNN had been used?
import torch
import torch.nn as nn

class RNN(nn.Module):
    # implement RNN from scratch rather than using nn.RNN
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input_tensor, hidden_tensor):
        combined = torch.cat((input_tensor, hidden_tensor), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def init_hidden(self):
        return torch.zeros(1, self.hidden_size)

# rnn, criterion and optimizer are defined elsewhere in the tutorial
def train(line_tensor, category_tensor):
    hidden = rnn.init_hidden()
    # feed the name one character at a time, carrying the hidden state along
    for i in range(line_tensor.size()[0]):
        output, hidden = rnn(line_tensor[i], hidden)
    loss = criterion(output, category_tensor)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return output, loss.item()
An equivalent way to phrase the question: how would the code be rewritten using self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)? Or, if that's not possible, what does the internal structure of nn.RNN look like?
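Here is my own rough guess at what an nn.RNN version might look like; I'm not sure it's really equivalent, the layer sizes and the last-time-step indexing are just my assumptions:

class RNNBuiltIn(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers=1):
        super(RNNBuiltIn, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        # batch_first=True means inputs have shape (batch, seq_len, input_size)
        self.rnn = nn.RNN(input_size, hidden_size, num_layers, batch_first=True)
        self.h2o = nn.Linear(hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input_tensor):
        # initial hidden state: (num_layers, batch, hidden_size), all zeros
        h0 = torch.zeros(self.num_layers, input_tensor.size(0), self.hidden_size)
        # nn.RNN loops over the sequence internally and returns the outputs
        # of every time step plus the final hidden state
        out, _ = self.rnn(input_tensor, h0)
        # classify from the last time step, like the tutorial's last output
        return self.softmax(self.h2o(out[:, -1, :]))

With something like this, I guess the character-by-character loop in train() would be replaced by a single call on the whole name tensor (reshaped to (1, seq_len, n_letters)), and the hidden state would no longer be passed around by hand. Is that right?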
Upvotes: 1
Views: 384
Reputation: 732
This model is a from-scratch implementation of an RNN: it implements the recurrence by hand instead of relying on the built-in nn.RNN module. In this example the hidden state and the gradients flow entirely through the computation graph.
def init_hidden(self):
    return torch.zeros(1, self.hidden_size)
The lines above initialize the hidden state (all zeros at first). After the first step we get an output and the next hidden state, which is then fed into the next step. All of this is handled through the computation graph, so backpropagation works across the whole sequence.
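As for the internal structure of nn.RNN: conceptually, a single tanh layer applies the recurrence below at every time step (this is just the formula from the docs written out by hand; the real implementation is fused C++/CUDA code, not a Python loop):

import torch

# one step of a single-layer, tanh nn.RNN:
#   h_t = tanh(x_t @ W_ih.T + b_ih + h_prev @ W_hh.T + b_hh)
def rnn_step(x_t, h_prev, W_ih, b_ih, W_hh, b_hh):
    return torch.tanh(x_t @ W_ih.T + b_ih + h_prev @ W_hh.T + b_hh)

The from-scratch model in the question does the same kind of thing, except it concatenates the input and the hidden state and pushes them through a single Linear layer instead of keeping two separate weight matrices.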
Upvotes: 1