Why does a Keras neural network predict the same number for all different images?

Question

I'm trying to use a Keras neural network in TensorFlow to recognize handwritten digits. But I don't know why, when I call predict(), it returns the same result for all input images.

Here is the code:

  import tensorflow as tf

  ### Train dataset ###
  mnist = tf.keras.datasets.mnist
  (x_train, y_train), (x_test, y_test) = mnist.load_data()
  x_train = x_train/255
  x_test = x_test/255

  model = tf.keras.models.Sequential()
  model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
  model.add(tf.keras.layers.Dense(units=128,activation=tf.nn.relu))
  model.add(tf.keras.layers.Dense(units=10,activation=tf.nn.softmax))

  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

  model.fit(x_train, y_train, epochs=5)

The result looks like this:

Epoch 1/5
1875/1875 [==============================] - 2s 672us/step - loss: 0.2620 - accuracy: 0.9248
Epoch 2/5
1875/1875 [==============================] - 1s 567us/step - loss: 0.1148 - accuracy: 0.9658
Epoch 3/5
1875/1875 [==============================] - 1s 559us/step - loss: 0.0784 - accuracy: 0.9764
Epoch 4/5
1875/1875 [==============================] - 1s 564us/step - loss: 0.0596 - accuracy: 0.9817
Epoch 5/5
1875/1875 [==============================] - 1s 567us/step - loss: 0.0462 - accuracy: 0.9859

Then the code I use to test the model on an image is below:

  import cv2 as cv
  import numpy as np
  import matplotlib.pyplot as plt

  img = cv.imread('path/to/1.png')
  img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
  img = cv.resize(img,(28,28))
  img = np.array([img])
    
  # Check whether the image is completely white (no dark pixels)
  if cv.countNonZero(255 - img[0]) == 0:
      print('')
  img = np.invert(img)
    
  plt.imshow(img[0])
  plt.show()
    
  prediction = model.predict(img)
  result = np.argmax(prediction)
  print(prediction)
  print(f'Result: {result}')

The result is:

[plt.show() output: input image]

[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
Result: 3

[plt.show() output: input image]

[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
Result: 3

Innat · Accepted Answer

Normalize your data at inference time the same way you did for the training set:

img = np.array([img]) / 255

Check this answer (Inference) for more details.
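One plausible reason the unnormalized input produces a one-hot output like the one in the question: pixel values up to 255 instead of 1 push the logits far apart, and softmax then saturates. Here is a minimal pure-Python sketch. The logits and the `softmax` helper are made up for illustration and are not taken from the trained model. It also ignores biases, in which case scaling the input of a Dense + ReLU network by 255 scales the logits by 255 as well.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for an input scaled to [0, 1] (illustrative values only).
logits = [0.5, 1.0, 1.2]

# Assumption: no biases, so feeding pixels in [0, 255] multiplies the logits by 255.
scaled = softmax(logits)
unscaled = softmax([255 * v for v in logits])

print([round(p, 3) for p in scaled])    # probabilities stay spread out
print([round(p, 3) for p in unscaled])  # prints [0.0, 0.0, 1.0]
```

With the small logits the probabilities stay spread across the classes, while the 255x logits collapse to a single 1.0, much like the [[0. 0. 0. 1. ...]] output in the question.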

Based on your 3rd comment, here are some details.

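import cv2
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

def input_prepare(img):
    img = cv2.resize(img, (28, 28))  # match the 28x28 MNIST input size
    img = cv2.bitwise_not(img)  # invert: MNIST digits are light on a dark background

    img = tf.cast(tf.divide(img, 255), tf.float64)  # scale to [0, 1], as in training
    img = tf.expand_dims(img, axis=0)  # add the batch dimension
    return img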

img = cv2.imread('/content/1.png')
orig = img.copy() # save for plotting later on 

img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # gray scaling 
img = input_prepare(img)

plt.imshow(tf.reshape(img, shape=[28, 28]))

plt.imshow(cv2.cvtColor(orig, cv2.COLOR_BGR2RGB))
plt.title(np.argmax(model.predict(img)))
plt.show()

It works as expected. However, resizing breaks up the digit's strokes and loses spatial information. That seems fine for the model in this case, but if the degradation gets much worse, the model will predict incorrectly. Here is an example case

and the model predicts the wrong digit for it.

plt.imshow(cv2.cvtColor(orig, cv2.COLOR_BGR2RGB))
plt.title(np.argmax(model.predict(img)))
plt.show()
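To illustrate how shrinking an image can break thin strokes, here is a simplified pure-Python sketch. The helpers `downsample_nearest` and `downsample_area` are hypothetical and are not what cv2.resize does internally (its default is bilinear interpolation). They only show that sampling sparsely can skip a one-pixel stroke entirely, while averaging over blocks keeps a faint trace of it. Both assume the image dimensions are divisible by the factor.

```python
def downsample_nearest(image, factor):
    """Keep every `factor`-th pixel (simplified nearest-neighbour sampling)."""
    return [row[::factor] for row in image[::factor]]

def downsample_area(image, factor):
    """Average each factor x factor block (simplified area averaging)."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [
                image[rr][cc]
                for rr in range(r, r + factor)
                for cc in range(c, c + factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

# 8x8 white image (255) with a one-pixel-wide dark stroke (0) in column 2.
image = [[0 if c == 2 else 255 for c in range(8)] for _ in range(8)]

nearest = downsample_nearest(image, 4)
area = downsample_area(image, 4)

print(nearest)  # [[255, 255], [255, 255]]: the stroke is gone
print(area)     # [[191.25, 255.0], [191.25, 255.0]]: the stroke survives as gray
```

In OpenCV, passing interpolation=cv2.INTER_AREA to cv2.resize is the option commonly suggested for shrinking images. Whether it helps for a particular digit image is something to check by plotting the resized result.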

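To fix this, we can apply cv2.erode after resizing to thicken the strokes. At that point the digit is still dark on a light background, so erosion grows the dark pixels. For example: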

def input_prepare(img):            
    img = cv2.resize(img, (28, 28))   
    img = cv2.erode(img, np.ones((2, 2)))  # thicken the dark strokes before inverting
    img = cv2.bitwise_not(img)   

    img = tf.cast(tf.divide(img, 255) , tf.float64)              
    img = tf.expand_dims(img, axis=0)   
    return img

This is perhaps not the best approach, but now the model recognizes the digit better.

plt.imshow(cv2.cvtColor(orig, cv2.COLOR_BGR2RGB))
plt.title(np.argmax(model.predict(img)))
plt.show()
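To see why erosion thickens a dark digit on a light background, here is a minimal pure-Python sketch of a min filter. The helper `erode_min` is hypothetical and uses a 3x3 neighbourhood with a clamped window, so it does not reproduce cv2.erode's exact anchor and border handling. It only shows the core idea: each output pixel takes the minimum of its neighbourhood, so dark pixels spread.

```python
def erode_min(image, radius=1):
    """Min filter: each output pixel is the minimum of its neighbourhood."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            # Clamp the window to the image, so border pixels use fewer neighbours.
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            row.append(min(neighbours))
        out.append(row)
    return out

# One-pixel-wide dark stroke (0) on a white background (255).
thin = [
    [255, 255, 0, 255, 255],
    [255, 255, 0, 255, 255],
    [255, 255, 0, 255, 255],
]
thick = erode_min(thin)

dark_before = sum(v == 0 for row in thin for v in row)
dark_after = sum(v == 0 for row in thick for v in row)
print(dark_before, dark_after)  # 3 9
```

The stroke grows from one to three pixels wide. On a light digit with a dark background the same filter would thin the digit instead, which is why the erosion in input_prepare runs before cv2.bitwise_not.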
