MNIST classifier fails on non-MNIST digits

I just realized that my MNIST digit classifier (a convolutional neural network) fails spectacularly on my own hand-drawn digits, reaching only about 55% accuracy (50% on black-on-white images, 60% on white-on-black images).

The result is rather surprising to me, considering that industrial character recognition software is very accurate on completely new characters.

The only explanation I've been able to find online for this is overfitting, which seems unlikely since my model has a 98.4% test accuracy. The other explanation would be that the MNIST data set is less general (or has a lower dimensionality) than one would hope, which also seems unlikely. Conversion to grayscale is also not the issue (see comments under Roshin's answer).

Anyone willing to have a look at my model and tell me what's wrong? The testing on my own characters is in the last filled cell.

Upvotes: 0

Views: 561

Answers (1)

Roshin Raphel

Reputation: 2699

The problem is with these lines:

# my own hand-drawn characters
with url.urlopen(img) as file:
  x2 = im.imread(file.read())
# average color channels, put in 1-size batch, normalize
x2 = np.array([[[[np.mean(entry)] for entry in row] for row in x2]]) / 255.0

You should convert the image to grayscale instead:

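import io

# my own hand-drawn characters
with url.urlopen(img) as file:
  # Image.open needs a path or file-like object, so wrap the downloaded bytes
  pil_im = Image.open(io.BytesIO(file.read())).convert('L')
# put in 1-size batch with a single channel, normalize to [0, 1]
x2 = np.array(pil_im)[np.newaxis, :, :, np.newaxis] / 255.0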

You should import the Image module from PIL beforehand:

from PIL import Image
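For reference, here is a minimal self-contained sketch of the same grayscale loading step, written as a function so it can be tried outside the notebook. The name `load_grayscale_batch` and the `invert` flag are hypothetical (they are not in the original notebook), and the (1, H, W, 1) output shape is an assumption based on the batch layout that the question's own code builds. The `invert` option is only there because the question tests both black-on-white and white-on-black images.

```python
import io

import numpy as np
from PIL import Image


def load_grayscale_batch(image_bytes: bytes, invert: bool = False) -> np.ndarray:
    """Decode image bytes into a (1, H, W, 1) float array scaled to [0, 1].

    Hypothetical helper for illustration; not part of the original notebook.
    """
    # Image.open expects a path or file-like object, so wrap the raw bytes.
    pil_im = Image.open(io.BytesIO(image_bytes)).convert("L")
    x = np.asarray(pil_im, dtype=np.float32) / 255.0
    if invert:
        # Flip polarity, e.g. black-on-white to white-on-black.
        x = 1.0 - x
    # Add a batch axis in front and a channel axis at the end.
    return x[np.newaxis, :, :, np.newaxis]


if __name__ == "__main__":
    # Build a small in-memory RGB test image instead of downloading one.
    rgb = Image.new("RGB", (28, 28), color=(255, 255, 255))
    buf = io.BytesIO()
    rgb.save(buf, format="PNG")
    batch = load_grayscale_batch(buf.getvalue())
    print(batch.shape)
```

With a downloaded image you would pass `file.read()` from `url.urlopen(img)` as `image_bytes`. Note that this only handles channel conversion and scaling; it does not resize the image to the model's input size.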

Upvotes: 1
