madmantel

Reputation: 19

Multi Layer Perceptron Deep Learning in Python using Pytorch

I am having errors in executing the train function of my code in MLP.

This is the error:

mat1 and mat2 shapes cannot be multiplied (128x10 and 48x10)

My code for the train function is this:

class net(nn.Module):
    def __init__(self, input_dim2, hidden_dim2, output_dim2):
        super(net, self).__init__()
        self.input_dim2 = input_dim2
        self.fc1 = nn.Linear(input_dim2, hidden_dim2)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_dim2, hidden_dim2)
        self.fc3 = nn.Linear(hidden_dim2, output_dim2)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        x = F.softmax(self.fc3(x))

        return x



model = net(input_dim2, hidden_dim2, output_dim2) #create the network
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.RMSprop(model.parameters(), lr = learning_rate2)


def train(num_epochs2):
    for i in range(num_epochs2):
        tmp_loss = []
        for (x, y) in train_loader:
            print(y.shape)
            print(x.shape)
            outputs = model(x)  # forward pass
            print(outputs.shape)
            loss = criterion(outputs, y)  # loss computation
            tmp_loss.append(loss.item())  # record the loss
            optimizer.zero_grad()  # clear the accumulated gradients
            loss.backward()  # auto-differentiation - gradient accumulation
            optimizer.step()  # a gradient step

        print("Loss at {}th epoch: {}".format(i, np.mean(tmp_loss)))

I don't know where I'm wrong; the code looks correct to me.

Upvotes: 0

Views: 1412

Answers (2)

Tengerye

Reputation: 1964

From the limited error message, I guess the place where you went wrong is the following snippet:

x = self.fc3(x) 
x = F.softmax(self.fc3(x))

Try to replace with:

x = self.fc3(x) 
x = F.softmax(x)

A good question should include the full error traceback and a complete toy example that reproduces the error!
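As a side note, nn.CrossEntropyLoss already applies log-softmax internally, so the explicit softmax is not needed when training with that criterion. A minimal sketch of the forward pass, reusing the layer names from the question, could simply return the raw logits:

def forward(self, x):
    x = self.relu(self.fc1(x))   # first hidden layer
    x = self.relu(self.fc2(x))   # second hidden layer
    return self.fc3(x)           # raw logits; CrossEntropyLoss applies log-softmax itself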

Upvotes: 2

avio8

Reputation: 82

Here, a ReLU activation seems to be missing in the '__init__' function, or there is an extra ReLU activation in the forward function. Look at the code below and try to figure out what is extra or missing.

def __init__(self, input_dim2, hidden_dim2, output_dim2):
    super(net, self).__init__()
    self.input_dim2 = input_dim2
    self.fc1 = nn.Linear(input_dim2, hidden_dim2)
    self.relu = nn.ReLU()
    self.fc2 = nn.Linear(hidden_dim2, hidden_dim2)
    self.fc3 = nn.Linear(hidden_dim2, output_dim2) 
def forward(self, x):
    x = self.fc1(x)
    x = self.relu(x)
    x = self.fc2(x)
    x = self.relu(x)
    x = self.fc3(x) 
    x = F.softmax(self.fc3(x)) 

    return x
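One quick way to see which step is extra is to print the tensor shape after every layer. This is only a debugging sketch reusing the layer names above; the shapes in the comments are assumptions based on the dimensions reported in the error message:

def forward(self, x):
    print("input:", x.shape)        # e.g. (128, input_dim2)
    x = self.relu(self.fc1(x))
    print("after fc1:", x.shape)    # (128, hidden_dim2)
    x = self.relu(self.fc2(x))
    print("after fc2:", x.shape)    # (128, hidden_dim2)
    x = self.fc3(x)
    print("after fc3:", x.shape)    # (128, output_dim2); calling fc3 again on this
    return x                        # would expect hidden_dim2 features and fail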

Upvotes: 0
