Reputation: 499
I am working on an image recognition neural network with Pytorch. My goal is to take pictures of handwritten math equations, process them, and use the neural network to recognize each element. I've reached the point where I am able to separate every variable, number, or symbol from the equation, and everything is ready to be sent through the neural network. I've trained my network to recognize numbers quite well (this part was quite easy), but now I want to expand the scope of the neural network to recognizing letters as well as numbers. I loaded handwritten letters along with the numbers into tensors, shuffled the elements, and put them into batches. No matter how I vary my learning rate, my architecture (hidden layers and the number of neurons per layer), or my batch size I cannot get the neural network to recognize letters.
Here is my network architecture and the feed-forward function (you can see I experimented with the number of hidden layers):
class NeuralNetwork(nn.Module):
def __init__(self):
super().__init__()
inputNeurons, hiddenNeurons, outputNeurons = 784, 700, 36
# Create tensors for the weights
self.layerOne = nn.Linear(inputNeurons, hiddenNeurons)
self.layerTwo = nn.Linear(hiddenNeurons, hiddenNeurons)
self.layerThree = nn.Linear(hiddenNeurons, outputNeurons)
#self.layerFour = nn.Linear(hiddenNeurons, outputNeurons)
#self.layerFive = nn.Linear(hiddenNeurons, outputNeurons)
# Create function for Forward propagation
def Forward(self, input):
# Begin Forward propagation
input = torch.sigmoid(self.layerOne(torch.sigmoid(input)))
input = torch.sigmoid(self.layerTwo(input))
input = torch.sigmoid(self.layerThree(input))
#input = torch.sigmoid(self.layerFour(input))
#input = torch.sigmoid(self.layerFive(input))
return input
And this is the training code block (the data is shuffled in a dataloader, the ground truths are shuffled in the same order, batch size is 10, total number letter and number data points is 244800):
neuralNet = NeuralNetwork()
params = list(neuralNet.parameters())
criterion = nn.MSELoss()
print(neuralNet)
dataSet = next(iter(imageDataLoader))
groundTruth = next(iter(groundTruthsDataLoader))
for i in range(15):
for k in range(24480):
neuralNet.zero_grad()
prediction = neuralNet.Forward(dataSet)
loss = criterion(prediction, groundTruth)
loss.backward()
for layer in range(len(params)):
# Updating the weights of the neural network
params[layer].data.sub_(params[layer].grad.data * learningRate)
Thanks for the help in advance!
Upvotes: 1
Views: 69
Reputation: 6618
First thing i would recommend is writing a clean Pytorch code
For eg.
if i see your NeuralNetwork class it should have forward
method (f in lower case),
so that you wont call it using prediction = neuralNet.Forward(dataSet)
. Reason being your hooks from neural network does not get dispatched if you use prediction = neuralNet.Forward(dataSet)
.
For more details refer this link
Second thing is : Since your dataset is not balance.....try to use undersampling / oversampling methods, which will be very helpful in your case.
Upvotes: 1