Reputation: 1288
I'm currently working on a handwritten digit recognition problem.
To start with, I tested my sample handwritten digits against the MNIST dataset.
I'm getting an accuracy of 53%, and I need above 90%.
Here is what I have tried so far to increase the accuracy.
Created my own dataset
I have created 41,000 examples. To get started, I built a smaller dataset of 10,000 examples (1,000 per digit).
The dataset follows the MNIST format (I was thinking of combining my dataset with the MNIST dataset at a later stage). The accuracy of the model built on this dataset was close to 65%.
Approach
So my questions are:
Is there another approach/algorithm that would detect the digits more accurately?
Do I need to train the model more?
Do I need to clean the images?
I am working on combining the MNIST dataset with my dataset (41,000 digits) to see whether that increases the accuracy.
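For reference, a minimal sketch of what combining the two datasets could look like, assuming both are already loaded as NumPy arrays of 28x28 images with integer labels. The function name `combine_datasets` and the placeholder arrays are hypothetical and only stand in for the real data; this is not the code from the notebooks.

```python
import numpy as np

def combine_datasets(x_a, y_a, x_b, y_b, seed=0):
    """Concatenate two image/label sets and shuffle them with one shared permutation."""
    if x_a.shape[1:] != x_b.shape[1:]:
        raise ValueError("image shapes differ; resize before combining")
    x = np.concatenate([x_a, x_b], axis=0)
    y = np.concatenate([y_a, y_b], axis=0)
    # The same permutation is applied to images and labels so pairs stay aligned.
    order = np.random.default_rng(seed).permutation(len(x))
    return x[order], y[order]

# Placeholder arrays standing in for MNIST and the custom dataset (assumption).
mnist_x = np.zeros((6, 28, 28), dtype=np.uint8)
mnist_y = np.arange(6) % 10
own_x = np.full((4, 28, 28), 255, dtype=np.uint8)
own_y = np.arange(4) % 10

x_all, y_all = combine_datasets(mnist_x, mnist_y, own_x, own_y)
print(x_all.shape, y_all.shape)
```

Shuffling matters here because otherwise every mini-batch near the end of an epoch would come from only one of the two sources.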
Code
To test my images against MNIST (run the MNIST script before this code)
You can find the IPython notebooks for:
Testing my sample digits against MNIST (Script 1)
Testing my sample digits against my dataset (Script 2)
The scripts and the images are available at this link
Upvotes: 1
Views: 2165
Reputation: 11
You can try converting the picture to monochrome (in MNIST each pixel value is between 0 and 255) and using a test of the form "is pixel i > 0". This increased our algorithm's accuracy to 80%. You can also try dividing the picture into frames (try 4 or 8). Moreover, you can build tests based on lines, curves, etc.
You can take a look at my implementation, which gave me up to 91%: https://github.com/orlevy08/Data-Analysis
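A rough sketch of the monochrome and frame ideas, assuming 28x28 grayscale NumPy arrays. The helper names `binarize` and `frame_features` are made up for illustration, and this is not taken from the linked repository.

```python
import numpy as np

def binarize(image, threshold=0):
    """Map a grayscale image (0..255) to 0/1 using the test 'pixel > threshold'."""
    return (np.asarray(image) > threshold).astype(np.uint8)

def frame_features(binary_image, frames_per_side=2):
    """Split the image into a grid of frames; return the fraction of 'on' pixels per frame."""
    h, w = binary_image.shape
    fh, fw = h // frames_per_side, w // frames_per_side
    features = []
    for r in range(frames_per_side):
        for c in range(frames_per_side):
            block = binary_image[r * fh:(r + 1) * fh, c * fw:(c + 1) * fw]
            features.append(block.mean())
    return np.array(features)

# Fake digit with ink only in the top-left quadrant (placeholder data).
digit = np.zeros((28, 28), dtype=np.uint8)
digit[:14, :14] = 200

feats = frame_features(binarize(digit), frames_per_side=2)  # 2 x 2 = 4 frames
print(feats)
```

The per-frame values can be used as features for a simple classifier, alongside or instead of the raw per-pixel tests.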
Upvotes: 0
Reputation: 11968
First a few notes:
Assuming you do all of these and still don't see improvements, it might mean there's an issue with your data. Check a confusion matrix to see where the model is having trouble, and look at some examples that are misclassified. In my experience, some 1s and 7s in the dataset are almost indistinguishable. This is not exactly a solution, but it should point you in the right direction on what you need to fix.
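As an illustration of the confusion-matrix check, here is a small NumPy-only sketch. The labels and predictions are invented toy data, and `most_confused_pairs` is a hypothetical helper rather than a library function.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Rows are true digits, columns are predicted digits."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulates correctly for repeated index pairs
    return cm

def most_confused_pairs(cm, top=3):
    """Return (true, predicted, count) for the largest off-diagonal entries."""
    off = cm.copy()
    np.fill_diagonal(off, 0)
    flat = np.argsort(off, axis=None)[::-1][:top]
    pairs = [np.unravel_index(i, off.shape) for i in flat]
    return [(int(t), int(p), int(off[t, p])) for t, p in pairs if off[t, p] > 0]

# Toy labels where 1s and 7s get mixed up (made-up data).
y_true = np.array([1, 1, 1, 7, 7, 7, 3, 3])
y_pred = np.array([1, 7, 7, 7, 1, 7, 3, 3])

cm = confusion_matrix(y_true, y_pred)
print(most_confused_pairs(cm))           # [(1, 7, 2), (7, 1, 1)]
print(np.flatnonzero(y_true != y_pred))  # indices of misclassified examples to inspect
```

Plotting the images at the misclassified indices is usually the quickest way to tell whether the problem is the model or the data.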
Upvotes: 3