mashedpotatoes

Reputation: 435

Given a set of images to be identified, and a trained model, how would I make the model identify the images?

How can I make a trained model identify images I extracted from somewhere else?

The model is trained with the MNIST dataset, and the images to be identified by the model are handwritten digits extracted from a document.

Libraries used are tensorflow 2.0, cv2, and numpy.

As I understand it, model.predict() identifies its input. By that I mean that if I pass in a handwritten image of '3' in some form, it will identify it and output '3'. Again, the model is trained on the MNIST dataset, based on this set of tutorials.

Assuming that is right, I'd like to know the function's parameters, or how I should format the image or set of images to get the expected output. If not, I'd like to know exactly how to accomplish this.

import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Load and prepare the MNIST dataset. Convert the samples from integers to floating-point numbers:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def createModel():  
  # Build the tf.keras.Sequential model by stacking layers. 
  # Choose an optimizer and loss function used for training:
  model = tf.keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
  ])

  model.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])

  return model

model = createModel()
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
model.evaluate(x_test, y_test)

c = cv2.imread("./3.png", 1)
c = c.reshape(-1, 28*28)/255.0

# now what?

I expected model.predict() to do what I need. These are my attempts so far:

model.predict(c) outputs TypeError: predict() missing 1 required positional argument: 'x'

model.predict([""], c) outputs ValueError: When using data tensors as input to a model, you should specify the `steps` argument.

And so on.

I know that at this point I'm going in blindly and incorrectly. Any step in the right direction is appreciated. Thanks!

EDIT:

I now know the input image c should be a 28x28 grayscale image even before reshaping, so I tried skipping the reshape. The error I got when I ran the prediction is:

...
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [28,28], In[1]: [784,128]
     [[{{node dense/MatMul}}]] [Op:__inference_keras_scratch_graph_2593]

So I used c = c.reshape(-1, 28*28)/255.0 before prediction, but then it never predicted the right value for any digit.

I then tried cv2.imshow(str(predicted_value), c) to see what the input image looks like. The image shown is just a thin line of black and white spots. As I can't embed images yet, here is the link to the output instead.

My question is: is this what the image is supposed to look like for the model, or have I messed something up? Thanks!

Upvotes: 0

Views: 324

Answers (1)

codeslord

Reputation: 2368

Because your model was trained on grayscale images, it expects grayscale input. A color image has 3 channels; a grayscale image has only 1.

So when loading the image, instead of 1, which stands for cv2.IMREAD_COLOR, pass 0, which corresponds to cv2.IMREAD_GRAYSCALE, to load the image in grayscale mode.

(NB: Use -1 for cv2.IMREAD_UNCHANGED. Refer to the OpenCV documentation here for more details.)
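To see why a color read breaks the reshape used in the question, here is a small NumPy-only sketch. The zero-filled arrays are stand-ins (an assumption for illustration) for what cv2.imread would return for a 28x28 file with flag 1 and with flag 0:

```python
import numpy as np

# Stand-ins for cv2.imread results on a 28x28 image (assumed shapes):
color = np.zeros((28, 28, 3), dtype=np.uint8)  # flag 1: three channels
gray = np.zeros((28, 28), dtype=np.uint8)      # flag 0: one channel

# The same reshape as in the question
color_rows = color.reshape(-1, 28 * 28)
gray_rows = gray.reshape(-1, 28 * 28)

print(color_rows.shape)  # (3, 784): three scrambled "samples"
print(gray_rows.shape)   # (1, 784): one sample, as intended
```

With the color read, the three rows are interleaved channel values rather than a digit, which would explain both the wrong predictions and the thin strip that cv2.imshow displays.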

yourimage = cv2.imread("yourimage.png", 0)

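For prediction, after reshaping the image into a batch of one (for example shape (1, 28, 28), matching the Flatten(input_shape=(28, 28)) layer) and scaling it to [0, 1] as was done for the training data, you can use: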

predicted_value = np.argmax(model.predict(yourimage))
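As a rough sketch of the preprocessing that could come before that call: the helper below, prepare_digit, and its invert parameter are hypothetical names, not part of any library. It assumes the image has already been loaded in grayscale and resized to 28x28. MNIST digits are light strokes on a dark background, so the helper guesses (an assumption) that a mostly bright image is dark ink on paper and inverts it:

```python
import numpy as np

def prepare_digit(gray, invert=None):
    """Turn a 28x28 uint8 grayscale image into a (1, 28, 28) float32 batch."""
    gray = np.asarray(gray)
    if gray.shape != (28, 28):
        raise ValueError(f"expected a 28x28 grayscale image, got {gray.shape}")

    # Same scaling as the training data: integers 0..255 -> floats 0..1
    x = gray.astype("float32") / 255.0

    # Assumption: a mostly bright image is dark ink on a light page,
    # so flip it to match MNIST's light-on-dark convention.
    if invert is None:
        invert = x.mean() > 0.5
    if invert:
        x = 1.0 - x

    # Add the batch dimension that model.predict expects
    return x[np.newaxis, ...]
```

With OpenCV this might be used as gray = cv2.resize(cv2.imread("./3.png", 0), (28, 28)), followed by np.argmax(model.predict(prepare_digit(gray))). Digits cropped from a document may also need centering or thresholding to resemble MNIST samples; that is not covered here.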

Upvotes: 1
