Mahsa

Reputation: 591

Training an RNN in PyTorch

I want to train an RNN model to generate "ihello" from "hihell". I am new to PyTorch and am following the instructions in a video to write the code. I have written two Python files, train.py and model.py. This is model.py:

#----------------- model for teach rnn hihell to ihello
#-----------------  OUR MODEL ---------------------
import torch
import torch.nn as nn
from torch import autograd

class Model(nn.Module):
    def __init__(self):
        super(Model,self).__init__()
        self.rnn=nn.RNN(input_size=input_size,hidden_size=hidden_size,batch_first=True)
    def forward(self,x,hidden):
        #Reshape input in (batch_size,sequence_length,input_size)
        x=x.view(batch_size,sequence_length,input_size)
        #Propagate input through RNN
        #Input:(batch,seq+len,input_size)
        out,hidden=self.rnn(x,hidden)
        out=out.view(-1,num_classes)
        return hidden,out
    def init_hidden(self):
        #Initialize hidden and cell states
        #(num_layers*num_directions,batch,hidden_size)
        return autograd.Variable(torch.zeros(num_layers,batch_size,hidden_size))

and this is train.py:

"""----------------------train for teach rnn to hihell to ihello--------------------------"""
#-----------------  DATA PREPARATION ---------------------
#Import
import torch
import torch.nn as nn
from torch import autograd
from model import Model
import sys


idx2char=['h','i','e','l','o']
#Teach hihell->ihello
x_data=[0,1,0,2,3,3]#hihell
y_data=[1,0,2,3,3,4]#ihello
one_hot_lookup=[[1,0,0,0,0],#0
                [0,1,0,0,0],#1
                [0,0,1,0,0],#2
                [0,0,0,1,0],#3
                [0,0,0,0,1]]#4
x_one_hot=[one_hot_lookup[x] for x in x_data]
inputs=autograd.Variable(torch.Tensor(x_one_hot))
labels=autograd.Variable(torch.LongTensor(y_data))
""" ----------- Parameters Initialization------------"""
num_classes = 5
input_size = 5  # one hot size
hidden_size = 5  # output from RNN to directly predict one-hot
batch_size = 1  # one sequence
sequence_length = 1  # let's do one by one
num_layers = 1  # one layer RNN
"""-----------------  LOSS AND TRAINING ---------------------"""
#Instantiate RNN model
model=Model()
#Set loss and optimizer function
#CrossEntropyLoss=LogSoftmax+NLLLOSS
criterion=torch.nn.CrossEntropyLoss()
optimizer=torch.optim.Adam(model.parameters(),lr=0.1)

"""----------------Train the model-------------------"""
for epoch in range(100):
    optimizer.zero_grad()
    loss=0
    hidden=model.init_hidden()
    sys.stdout.write("Predicted String:")
    for input,label in zip(inputs,labels):
        #print(input.size(),label.size())
        hidden,output=model(input,hidden)
        val,idx=output.max(1)
        sys.stdout.write(idx2char[idx.data[0]])
        loss+=criterion(output,label)
    print(",epoch:%d,loss:%1.3f"%(epoch+1,loss.data[0]))
    loss.backward()
    optimizer.step()

When I run train.py, I receive this error:

self.rnn=nn.RNN(input_size=input_size,hidden_size=hidden_size,batch_first=True)
NameError: name 'input_size' is not defined

I don't understand why I get this error, because I set input_size = 5 in train.py before creating the model. Could anybody help me? Thanks.

Upvotes: 0

Views: 1494

Answers (1)

Aechlys

Reputation: 1306

The variables defined in train.py (num_classes, input_size, ...) are module-level globals of train.py, so they are visible only in that file. model.py has its own namespace and knows nothing about them; importing Model into train.py does not change that. I suggest passing these values as arguments to the constructor:

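class Model(nn.Module):
  def __init__(self, hidden_size, input_size):
    super(Model, self).__init__()
    # Same as before, but the sizes now come from the arguments
    self.rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size, batch_first=True)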

and then call the Model as:

model = Model(hidden_size, input_size)

Similarly, any other variable defined in train.py that you want to use in model.py (batch_size, sequence_length, num_classes, num_layers) has to be passed as an argument, either to the method that needs it or to the constructor, where you can store it as an attribute.
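
Putting that together, a minimal sketch of a self-contained model.py might look like the following. It is not a definitive version: it assumes PyTorch 0.4 or later (so plain tensors replace autograd.Variable), it assumes one character is fed per step as in the question, and it reuses hidden_size as the number of classes because both are 5 there. The keyword-argument call and the batch_size parameter on init_hidden are illustrative choices, not part of the original code.

```python
import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers=1):
        super().__init__()
        # Store the sizes as attributes so that forward() and init_hidden()
        # no longer depend on globals defined in train.py.
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)

    def forward(self, x, hidden):
        # Reshape to (batch_size, sequence_length, input_size). One character
        # is fed at a time, so both leading dimensions are 1 (an assumption
        # carried over from the question).
        x = x.view(1, 1, self.input_size)
        out, hidden = self.rnn(x, hidden)
        # hidden_size doubles as the number of classes in this example.
        out = out.view(-1, self.hidden_size)
        return hidden, out

    def init_hidden(self, batch_size=1):
        # Shape: (num_layers * num_directions, batch_size, hidden_size)
        return torch.zeros(self.num_layers, batch_size, self.hidden_size)


# Usage, as it would appear in train.py:
model = Model(input_size=5, hidden_size=5)
hidden = model.init_hidden()
one_hot_h = torch.tensor([1.0, 0.0, 0.0, 0.0, 0.0])  # one-hot vector for 'h'
hidden, output = model(one_hot_h, hidden)
```

Passing the sizes by keyword avoids mixing up their order, which is easy to do when input_size and hidden_size happen to have the same value.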

Upvotes: 2
