kst92

Reputation: 35

Getting the same output when making predictions

I'm new to ML. I'm trying to build a basic image-classification example for digits. I created my own dataset, but I get poor accuracy (11%). I have 246 items for training and 62 for testing. Here is my code:

#TRAINING

import os
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image

def load_data(input_path, img_height, img_width):
  data = []
  labels = []
  for imagePath in os.listdir(input_path):  
    labels_path = os.path.join(input_path, imagePath)
    if os.path.isdir(labels_path): 
      for img_path in os.listdir(labels_path):
        labels.append(imagePath)
        img_full_path = os.path.join(labels_path, img_path)
        img = image.load_img(img_full_path, target_size=(img_height, img_width)) 
        img = image.img_to_array(img)
        data.append(img)
  return data, labels



train_data = []
train_labels = []
test_data = []
test_labels = []
train_data, train_labels = load_data(train_path, 28, 28)
test_data, test_labels = load_data(test_path, 28, 28)

train_data = np.array(train_data)
train_data = train_data / 255.0
train_data = tf.reshape(train_data, train_data.shape[:3])
train_labels = np.array(train_labels)
train_labels = np.asfarray(train_labels, float)

test_data = np.array(test_data)
test_data = tf.reshape(test_data, test_data.shape[:3])
test_data = test_data / 255.0
test_labels = np.array(test_labels)

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(train_data, train_labels, batch_size=10, epochs=5, steps_per_epoch=246)

test_loss, test_acc = model.evaluate(test_data, test_labels, steps=1)
print('Test accuracy:', test_acc)

#CLASSIFICATION

def classify(input_path):
    if os.path.isdir(input_path):
        images = []
        for file_path in os.listdir(input_path):
            full_path = os.path.join(input_path, file_path)
            img_tensor = preprocess_images(full_path, 28, 28, "L")
            images.append(img_tensor)
        images = np.array(images)
        images = tf.reshape(images,(images.shape[0],images.shape[2],images.shape[3]))
        predictions = model.predict(images, steps=1)

        for i in range(len(predictions)):
            print("Image", i, "is", np.argmax(predictions[i]))

def preprocess_images(image_path, img_height, img_width, mode):
    img = image.load_img(image_path, target_size=(img_height, img_width))
    #convert 3-channel image to 1-channel
    img = img.convert(mode)
    img_tensor = image.img_to_array(img) 
    img_tensor = np.expand_dims(img_tensor, axis=0)   
    img_tensor /= 255.0
    img_tensor = tf.reshape(img_tensor, img_tensor.shape[:3])
    return tf.keras.backend.eval(img_tensor)

When I make predictions, I always get the result "Image is 5". So I have two questions:

- How can I get the other classes [0-9] as output?
- Can I get better accuracy by increasing the amount of data?

Thanks.

Upvotes: 1

Views: 99

Answers (1)

Stewart_R

Reputation: 14485

TL;DR

Your load_data() function is to blame: you need to return the labels of the dataset as integers rather than as the string file path.

Much fuller explanation:

Can I get better accuracy by increasing the amount of data?

In general, yes.

There is nothing intrinsically wrong with your model. I obviously don't have access to the dataset you created, but I can test the model on the MNIST dataset (which your dataset is presumably trying to mirror):

import numpy as np
import tensorflow as tf

(train_data, train_labels), (test_data, test_labels) = tf.keras.datasets.mnist.load_data()

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model.fit(train_data, train_labels, batch_size=10, epochs=5)

test_loss, test_acc = model.evaluate(test_data, test_labels)
print('Test accuracy:', test_acc)

Having done so, we can train to an accuracy of roughly 93%:

Test accuracy: 0.9275

Your inference code then also works as expected on the test data:

predictions = model.predict(test_data)

for i in range(len(predictions)):
    print("Image", i , "is", np.argmax(predictions[i]))

giving the output you'd expect:

Image 0 is 7
Image 1 is 2
Image 2 is 1
Image 3 is 0
Image 4 is 4
...

So we know the model can work. Is the difference in performance simply down to the size of your dataset (246 images) compared to MNIST (60,000)?

Well, this is an easy thing to test: we can take a similarly sized slice of the MNIST data and repeat the exercise:

train_data = train_data[:246]
train_labels = train_labels[:246]

test_data = test_data[:62]
test_labels = test_labels[:62]

This time I see a dramatic reduction in accuracy (c. 66%), but even with such a small subset I can train the model to a much higher accuracy than you are seeing.

Therefore the issue has to be with your data pre-processing (or the dataset itself).
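A quick sanity check (a hypothetical snippet, reusing your variable names) is to print the labels straight out of load_data(), before any conversion:

train_data, train_labels = load_data(train_path, 28, 28)
print(train_labels[:10])  # should be integers 0-9, not path strings

If this prints strings or paths rather than integers, the labels are the problem.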

In fact, looking at your load_data() function, I can see that the problem lies in the labels you are generating. Your labels just appear to be the image path. You have this:

# --snip--

for img_path in os.listdir(labels_path):
  labels.append(imagePath) ## <-- this does not look right!

# --snip--

Instead, you need to populate labels with the integer value of the category each image belongs to (for the MNIST digits, an integer between 0 and 9).
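For example, here is a minimal sketch of a corrected load_data(). It assumes each class directory is named after the digit it contains (e.g. train/0/, train/1/, ...); if your folders are named differently, you would need to map the folder name to an integer instead:

import os
from tensorflow.keras.preprocessing import image

def load_data(input_path, img_height, img_width):
    data = []
    labels = []
    for label_dir in os.listdir(input_path):
        labels_path = os.path.join(input_path, label_dir)
        if os.path.isdir(labels_path):
            for img_name in os.listdir(labels_path):
                # Directory name -> integer class label (assumes folders "0".."9")
                labels.append(int(label_dir))
                img_full_path = os.path.join(labels_path, img_name)
                img = image.load_img(img_full_path, target_size=(img_height, img_width))
                data.append(image.img_to_array(img))
    return data, labels

This keeps the same return signature as your original function, so the rest of your training code should work unchanged.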

Upvotes: 1
