kumar030

Reputation: 71

Digit Recognition with a CNN

I am testing printed digits (0-9) with a Convolutional Neural Network. It gives 99+% accuracy on the MNIST dataset, but when I tried it with fonts installed on my computer (Arial, Calibri, Cambria, Cambria Math, Times New Roman) and trained on the images generated from those fonts (104 images; 25 fonts, 4 slightly different images per font), the training error rate does not go below 80%, i.e. 20% accuracy. Why?

Here is a sample of the "2" images:

[image: "2" number images]

I resized every image to 28 x 28.

Here is more detail:

Training data size: 28 x 28 images. Network parameters: LeNet-5-style architecture -

Input Layer - 28x28
| Convolutional Layer - (ReLU activation)
| Pooling Layer - (tanh activation)
| Convolutional Layer - (ReLU activation)
| Local Layer (120 neurons) - (ReLU activation)
| Fully Connected Layer - (softmax activation, 10 outputs)

This works, giving 99+% accuracy on MNIST. Why is it so bad with computer-generated fonts? A CNN can handle a lot of variance in the data.

Upvotes: 7

Views: 2591

Answers (3)

Martin Thoma

Reputation: 136297

I see two likely problems:

Preprocessing: MNIST is not only 28px x 28px, but also:

The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. the images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.

Source: MNIST website
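You can approximate that normalization for your font images before feeding them to the network. Below is a minimal pure-NumPy sketch (`mnist_style_preprocess` is a hypothetical helper name; it uses nearest-neighbour resizing instead of MNIST's anti-aliased resampling, and whole-pixel centre-of-mass shifts, so it is an approximation only):

```python
import numpy as np

def mnist_style_preprocess(img):
    """Approximate MNIST's normalization for a white-on-black digit
    image (2-D float array with at least one nonzero pixel)."""
    # 1. Crop to the bounding box of the ink.
    ys, xs = np.nonzero(img > 0)
    digit = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    # 2. Resize so the longer side is 20 px, preserving aspect ratio.
    #    (MNIST used anti-aliased resampling; nearest-neighbour keeps
    #    this sketch dependency-free.)
    h, w = digit.shape
    scale = 20.0 / max(h, w)
    nh = max(1, int(round(h * scale)))
    nw = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(nh) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(nw) / scale).astype(int), w - 1)
    digit = digit[rows][:, cols]

    # 3. Paste into a 28x28 canvas.
    canvas = np.zeros((28, 28), dtype=np.float32)
    top, left = (28 - nh) // 2, (28 - nw) // 2
    canvas[top:top + nh, left:left + nw] = digit

    # 4. Shift (whole pixels) so the centre of mass sits at the
    #    centre of the 28x28 field.
    ys, xs = np.nonzero(canvas)
    weights = canvas[ys, xs]
    cy = float((ys * weights).sum() / weights.sum())
    cx = float((xs * weights).sum() / weights.sum())
    return np.roll(canvas,
                   (int(round(13.5 - cy)), int(round(13.5 - cx))),
                   axis=(0, 1))
```

If your font renderings are black digits on white, invert them first; MNIST digits are bright ink on a dark background.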

Overfitting:

  • MNIST has 60,000 training examples and 10,000 test examples. How many do you have?
  • Did you try dropout (see paper)?
  • Did you try dataset augmentation techniques? (e.g. slightly shifting the image, probably changing the aspect ratio a bit, you could also add noise - however, I don't think those will help)
  • Did you try smaller networks? (And how big are your filters / how many filters do you have?)
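The dropout suggestion is cheap to try. A sketch of the usual "inverted dropout" in NumPy (`dropout` is a hypothetical helper, not part of any particular framework):

```python
import numpy as np

def dropout(activations, p=0.5, rng=None):
    """Inverted dropout: zero each unit with probability p during
    training and rescale the survivors by 1/(1-p), so the expected
    activation is unchanged and no rescaling is needed at test time."""
    rng = np.random.default_rng() if rng is None else rng
    mask = (rng.random(activations.shape) >= p) / (1.0 - p)
    return activations * mask
```

Most frameworks provide this as a built-in layer, so in practice you would enable it on the fully connected layers rather than hand-rolling it.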

Remarks

Interesting idea! Did you try simply applying the trained MNIST network to your data? What are the results?

Upvotes: 2

FiReTiTi

Reputation: 5888

It definitely looks like an overfitting issue. I see that you have two convolution layers, two max-pooling layers and two fully connected layers, but how many weights in total? You only have 96 examples per class, which is almost certainly fewer than the number of weights in your CNN. Remember that you want at least 5 times more instances in your training set than weights in your CNN.
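To make that rule of thumb concrete, here is a rough count using the classic LeNet-5 layer sizes (6 and 16 filters of 5x5, a 120-neuron layer; the question does not state the actual filter counts, so these sizes are assumptions):

```python
# Parameter count for a LeNet-5-style net on 28x28 inputs (assumed sizes).
conv1 = 6 * (5 * 5 * 1) + 6       # 6 filters, 5x5, 1 input channel    -> 156
conv2 = 16 * (5 * 5 * 6) + 16     # 16 filters, 5x5, 6 input channels  -> 2416
fc1   = 120 * (16 * 5 * 5) + 120  # 120 neurons on a 16x5x5 feature map
fc2   = 10 * 120 + 10             # softmax layer, 10 outputs
total = conv1 + conv2 + fc1 + fc2
examples = 96 * 10                # 96 examples per class, 10 classes

print(total)                      # 51902 weights
print(5 * total, examples)        # need ~260k examples by the rule; have 960
```

Even with modest layer sizes, the weight count dwarfs a 960-image training set, which is why augmentation matters so much here.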

You have two solutions to improve your CNN:

  • Shake each instance in the training set: shift each number by about 1 pixel in every direction. That alone multiplies your training set by 9.
  • Use a transformer layer. It applies an elastic deformation to each number at each epoch, which strengthens the learning considerably by artificially increasing your training set. Moreover, it will make the network much more effective at predicting other fonts.
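The first suggestion takes only a few lines of NumPy (`shifted_variants` is a hypothetical helper; it returns the original 28x28 image plus its 8 one-pixel translations, padding the exposed border with zeros):

```python
import numpy as np

def shifted_variants(img):
    """All 3x3 one-pixel shifts of a 28x28 digit image: the original
    plus its 8 neighbours, with zero padding at the exposed border."""
    padded = np.pad(img, 1)  # 30x30, zero border
    return [padded[dy:dy + 28, dx:dx + 28]
            for dy in (0, 1, 2)
            for dx in (0, 1, 2)]
```

Applying this to every training image turns 960 examples into 8640 at no labelling cost.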

Upvotes: 0

Santiago Regojo

Reputation: 405

It may be an overfitting problem, which can happen when your network is too complex for the problem it is trying to solve. Check this article: http://es.mathworks.com/help/nnet/ug/improve-neural-network-generalization-and-avoid-overfitting.html

Upvotes: 0
