Bogdan

Reputation: 608

TensorFlow ValueError: expected flatten_input to have shape: what is the correct way to load and process an image?

I am working through the tutorial TensorFlow 2 quickstart for beginners, and I got it working.

Then I drew a digit in Paint (7.jpg, 200x200 px).

Now I want the model to guess which digit it is. I tried to process the image like this:

import tensorflow as tf
import numpy as np


mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# the tutorial example contains epochs=5, but 1 is to run faster and get smaller output
model.fit(x_train, y_train, epochs=1)  

img_path = "7.jpg"
img_raw = tf.io.read_file(img_path)
img_tensor = tf.image.decode_image(img_raw)
img_final = tf.image.resize(img_tensor, [28, 28])
img_final = img_final / 255.0

print("img_final.shape =", img_final.shape)
predict = model.predict(img_final)

and I get this output:

Train on 60000 samples                                                                                                                                
60000/60000 [==============================] - 3s 52us/sample - loss: 0.3013 - accuracy: 0.9120                                                       
img_final.shape = (28, 28, 3)                                                                                                                         
Traceback (most recent call last):                                                                                                                    
  File "main.py", line 33, in <module>                                                                                                                
    predict = model.predict(img_final)                                                                                                                
  File "D:\user\python\tensor\venv\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 909, in predict                           
    use_multiprocessing=use_multiprocessing)                                                                                                          
  File "D:\user\python\tensor\venv\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 462, in predict                        
    steps=steps, callbacks=callbacks, **kwargs)                                                                                                       
  File "D:\user\python\tensor\venv\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 396, in _model_iteration               
    distribution_strategy=strategy)                                                                                                                   
  File "D:\user\python\tensor\venv\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 594, in _process_inputs                
    steps=steps)                                                                                                                                      
  File "D:\user\python\tensor\venv\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 2472, in _standardize_user_data           
    exception_prefix='input')                                                                                                                         
  File "D:\user\python\tensor\venv\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py", line 574, in standardize_input_data      
    str(data_shape))                                                                                                                                  
ValueError: Error when checking input: expected flatten_input to have shape (28, 28) but got array with shape (28, 3)

I added print("img_final.shape =", img_final.shape) to see the shape of the input image. It prints img_final.shape = (28, 28, 3).

I know that 3 is the number of channels. From the documentation for tf.io.decode_image:

Note: decode_gif returns a 4-D array [num_frames, height, width, 3], as opposed to decode_bmp, decode_jpeg and decode_png, which return 3-D arrays [height, width, num_channels].

So I have the image as a 3-D array, but the model expects an input array with shape (28, 28).

How can I convert it to (28, 28)? Do I have to make it monochrome, or apply some other processing?

Update: following the answers from Ronald and Taras, I've added a conversion to grayscale and some prints. Now I have:

print("img_final.shape =", img_final.shape)

img_final = tf.image.rgb_to_grayscale(img_final)
print("grayscale img_final.shape =", img_final.shape)

img_final = tf.expand_dims(img_final[:, :, :1], axis=0)
print("expand_dims img_final.shape =", img_final.shape)

predict = model.predict(img_final)

And the output:

img_final.shape = (28, 28, 3)                                                                                                                         
grayscale img_final.shape = (28, 28, 1)                                                                                                               
expand_dims img_final.shape = (1, 28, 28, 1)
Traceback (most recent call last):
  File "main.py", line 42, in <module> 
    ...
ValueError: Error when checking input: expected flatten_input to have 3 dimensions, but got array with shape (1, 28, 28, 1)
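Both tracebacks are consistent with Keras reading the first axis of whatever is passed to predict as the batch axis and comparing the remaining axes with the declared input_shape. A minimal NumPy sketch of that comparison follows. It only illustrates the rule and is not the actual Keras check; the helper name per_sample_shape is made up for this example.

```python
import numpy as np

# Per-sample shape declared by Flatten(input_shape=(28, 28)) in the model above.
EXPECTED = (28, 28)


def per_sample_shape(x):
    """Return the shape that remains once axis 0 is read as the batch axis."""
    return tuple(x.shape[1:])


candidates = {
    "resized RGB image": np.zeros((28, 28, 3)),
    "grayscale plus batch axis": np.zeros((1, 28, 28, 1)),
    "batch of one 28x28 image": np.zeros((1, 28, 28)),
}

for name, x in candidates.items():
    shape = per_sample_shape(x)
    verdict = "matches" if shape == EXPECTED else "does not match"
    print(f"{name}: {x.shape} -> per-sample {shape} {verdict} {EXPECTED}")
```

Read this way, the (28, 28, 3) tensor looks like 28 samples of shape (28, 3), which is the shape quoted in the first error message.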

Upvotes: 1

Views: 423

Answers (2)

Taras Khalymon

Reputation: 614

Your model is designed to process several images in one run, so it expects a batch of images with the same shape as the ones it was trained on. Its first layer is Flatten(input_shape=(28, 28)), so you should pass a tensor with shape (num_images, 28, 28). You want to predict 1 image, so it should be (1, 28, 28). But your image tensor is (28, 28, 3) after resizing (the file itself is 200x200 with 3 channels). As you have a grayscale image, you can just take the first channel and add a batch axis:

img_final = tf.expand_dims(img_final[:, :, 0], axis=0)

And then run predict.

Instead of taking the first channel, you can use tf.image.rgb_to_grayscale(), as Ronald answered below. Its result keeps a trailing channel axis of size 1, which has to be dropped before adding the batch axis.
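A minimal sketch of the shape handling, assuming the model from the question, whose first layer is Flatten(input_shape=(28, 28)). The traceback in the question's update indicates that this model wants a batch of shape (num_images, 28, 28) with no trailing channel axis, so the sketch drops that axis after the grayscale conversion. The random tensor is only a stand-in for the decoded and resized 7.jpg, and the model is left untrained, so the predicted class is meaningless here; only the shapes matter.

```python
import numpy as np
import tensorflow as tf

# Same architecture as in the question (untrained in this sketch).
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Stand-in for the resized image: height x width x RGB, values in [0, 1].
img = tf.random.uniform((28, 28, 3))

gray = tf.image.rgb_to_grayscale(img)   # (28, 28, 1)
gray = tf.squeeze(gray, axis=-1)        # (28, 28): drop the channel axis
batch = tf.expand_dims(gray, axis=0)    # (1, 28, 28): add the batch axis

probs = model.predict(batch.numpy(), verbose=0)   # one row of 10 class scores
print("batch shape:", batch.shape)
print("probs shape:", probs.shape)
print("predicted class index:", int(np.argmax(probs[0])))
```

With the trained model from the question, the same three shape steps would be applied to img_final before calling predict.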

Upvotes: 1

Ronald Pereira
Ronald Pereira

Reputation: 407

The MNIST database only has grayscale pictures, so they have only 1 color channel. So yes, you have to make your image monochrome by converting the 3 channels (I suppose it's RGB) to grayscale. You can use

tf.image.rgb_to_grayscale(images)

See the tf.image.rgb_to_grayscale documentation for further information.
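As a possible alternative to converting after decoding, tf.io.decode_jpeg accepts channels=1 and decodes straight to grayscale. The sketch below wraps the whole loading path in a helper. The name load_digit and the generated test image are made up for this example (the image stands in for 7.jpg), and the final squeeze assumes the model from the question, which takes (num_images, 28, 28) input.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def load_digit(path):
    """Load a JPEG and return a float32 tensor of shape (1, 28, 28) in [0, 1]."""
    raw = tf.io.read_file(path)
    img = tf.io.decode_jpeg(raw, channels=1)   # (H, W, 1), uint8, grayscale
    img = tf.image.resize(img, [28, 28])       # (28, 28, 1), float32
    img = img / 255.0                          # scale like the training data
    img = tf.squeeze(img, axis=-1)             # (28, 28)
    return tf.expand_dims(img, axis=0)         # (1, 28, 28)


# Hypothetical stand-in for 7.jpg: a 200x200 white canvas with a dark bar.
canvas = np.full((200, 200, 3), 255, dtype=np.uint8)
canvas[40:160, 90:110, :] = 0
path = os.path.join(tempfile.mkdtemp(), "digit.jpg")
tf.io.write_file(path, tf.io.encode_jpeg(canvas))

batch = load_digit(path)
print(batch.shape, batch.dtype)
```

MNIST digits are light strokes on a dark background, so if the drawing is dark on white, inverting it with 1.0 - batch before predicting may help. That is an assumption about the drawing, not something stated in the question.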

Upvotes: 2
