Reputation: 11
So, I train my neural network on the example training data from Keras and then feed it my own hand-written digit, drawn in Paint.
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
path = 'C:/Users/pewdu/Desktop/three.png'
img = cv2.imread(path)
new_img = cv2.resize(img, (28, 28))
new_img = new_img[:,:,0] / 255.0 # Take only first channel and normalize
new_img = np.expand_dims(new_img, axis=0) # Adding the dimension
print(new_img.shape) # it equals to (1, 28, 28)
prediction = model.predict(new_img)
The problem is that whatever digit I feed it, it gives a wrong prediction (always the same fixed number). For example, if I feed it the number 3 it responds with 5, and if I feed it another number it also responds with 5. It works correctly on the example test data, though.
I also think the problem might be that my digit has a different background from the example training data: my picture's background is yellow. Here is an image of my pictures.
Upvotes: 0
Views: 168
Reputation: 41
It looks like you need to invert your image. The original MNIST images have higher values for the digit pixels, i.e. a light digit on a black background, but your image seems to be the other way around: a dark digit on a light background. Invert the colours of your drawing so that dark becomes light and light becomes dark.
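A minimal sketch of that inversion, assuming the drawing has already been loaded as an 8-bit grayscale array (for example with cv2.imread(path, cv2.IMREAD_GRAYSCALE)). The helper name invert_grayscale and the stand-in array are made up for illustration; cv2.bitwise_not should give the same result on a uint8 image.

```python
import numpy as np

def invert_grayscale(img: np.ndarray) -> np.ndarray:
    """Swap dark and light in an 8-bit grayscale image (0 <-> 255)."""
    if img.dtype != np.uint8:
        raise TypeError("expected an 8-bit (uint8) image")
    return 255 - img

# Stand-in for a drawing: dark digit stroke (0) on a light background (255).
drawing = np.full((28, 28), 255, dtype=np.uint8)
drawing[10:18, 13:15] = 0

# After inversion the stroke is bright and the background is black,
# which is the convention the MNIST training images use.
inverted = invert_grayscale(drawing)
```

The inverted array can then be divided by 255.0 and given a batch dimension exactly as in the question's code.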
Upvotes: 4
Reputation: 27050
You're preprocessing your input data incorrectly.
MNIST images are grayscale, with a bright digit on a black background: the pixel values lie in the range [0, 255], and most of them are at or near 0 or 255. Your network learned this.
To feed your image in correctly, you have to binarize it so that it resembles the images your model was trained on.
You can do it using OpenCV, reading the image in grayscale and applying thresholding:
img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
new_img = cv2.resize(img, (28, 28)) # already single channel
# get a binary image: white number, black background
# cv2.threshold returns (threshold_used, thresholded_image), in that order
_, new_img = cv2.threshold(new_img, 127, 255, cv2.THRESH_BINARY_INV)
new_img = new_img / 255. # normalize and make it float
new_img = np.expand_dims(new_img, axis=0) # Adding the batch dimension
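Note that THRESH_BINARY_INV only gives a white digit on a black background when the drawing is dark on light. As a rough sketch, the same thresholding can be written in plain NumPy with the polarity guessed from the border pixels. The function name to_mnist_like and the assumption that the image border is background are mine, not part of OpenCV or Keras, and the input is assumed to be a 28x28 uint8 grayscale array that has already been resized.

```python
import numpy as np

def to_mnist_like(gray: np.ndarray, thresh: int = 127) -> np.ndarray:
    """Binarize a 28x28 uint8 grayscale image to a white digit on a black
    background, scale it to [0, 1], and add a batch dimension."""
    if gray.shape != (28, 28):
        raise ValueError("expected a 28x28 grayscale image")
    # Assumption: the outermost rows and columns contain only background.
    border = np.concatenate([gray[0, :], gray[-1, :], gray[:, 0], gray[:, -1]])
    light_background = border.mean() > thresh
    if light_background:
        # Same rule as cv2.THRESH_BINARY_INV: dark strokes become foreground.
        binary = gray <= thresh
    else:
        # Same rule as cv2.THRESH_BINARY: bright strokes stay foreground.
        binary = gray > thresh
    # True/False -> 1.0/0.0, then shape (28, 28) -> (1, 28, 28).
    return binary.astype(np.float32)[np.newaxis, ...]

# Two stand-in drawings of the same stroke with opposite polarity.
dark_on_light = np.full((28, 28), 230, dtype=np.uint8)
dark_on_light[8:20, 12:16] = 20
light_on_dark = 255 - dark_on_light

a = to_mnist_like(dark_on_light)
b = to_mnist_like(light_on_dark)
```

Both inputs end up as the same (1, 28, 28) array, which could then be passed to model.predict from the question.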
Upvotes: 0