N.Messers

Reputation: 60

Load several images without labels in a Keras CNN

I have several .jpeg images with different names that I want to load into a CNN in a Jupyter notebook to have them classified. The only way I found was:

test_image = image.load_img("name_of_picture.jpeg", target_size=(64, 64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)
result = cnn.predict(test_image)

All the other things I found in the Keras API, like tf.keras.preprocessing.image_dataset_from_directory(), seem to only work on labeled data. Sadly, I can't simply iterate over the names of the pictures, as they are all named differently. Is there a way to predict all of them at once without naming every single picture?

Thanks for your help,

Nick

Upvotes: 0

Views: 1148

Answers (3)

Suman

Reputation: 738

Store the image files in a subdirectory like this:

  train_dataset
     |
     |--class_0
        | 
        |- <images>
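
If the images currently sit loose in a single folder, a small helper along these lines could copy them into that layout. This is only a sketch: the function name stage_images, the source folder, and the accepted file extensions are assumptions for illustration, not something from the original setup.

```python
from pathlib import Path
import shutil


def stage_images(source_dir, target_root="train_dataset", subdir="class_0",
                 patterns=("*.jpeg", "*.jpg")):
    """Copy all images matching `patterns` from source_dir into target_root/subdir."""
    target = Path(target_root) / subdir
    target.mkdir(parents=True, exist_ok=True)  # create train_dataset/class_0 if missing

    copied = []
    for pattern in patterns:
        # The file names themselves do not matter, only the extension
        for src in sorted(Path(source_dir).glob(pattern)):
            dst = target / src.name
            shutil.copy2(src, dst)
            copied.append(dst)
    return copied


# Hypothetical usage ("my_images" is a placeholder folder name):
# stage_images("my_images")
```

Copying rather than moving keeps the original folder untouched, which is the safer default while experimenting.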

Now you can use ImageDataGenerator to load the images from the directory (in our case, from "train_dataset/class_0"):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1./255,  # Normalize the images
    rotation_range=20,  # Random rotations
    width_shift_range=0.2,  # Horizontal shifts
    height_shift_range=0.2,  # Vertical shifts
    shear_range=0.2,  # Shear transformations
    zoom_range=0.2,  # Random zoom
    horizontal_flip=True,  # Horizontal flips
    fill_mode='nearest',  # Filling strategy for new pixels
    validation_split=0.2
)

train_generator = datagen.flow_from_directory(
    image_directory,
    target_size=(64, 64),  # Adjust based on your needs
    batch_size=8,
    class_mode=None,  # Yield images only, no labels
    shuffle=True,
    subset='training'  # Set as training data
)

validation_generator = datagen.flow_from_directory(
    image_directory,
    target_size=(64, 64),  # Adjust based on your needs
    batch_size=8,
    class_mode=None,  # Yield images only, no labels
    shuffle=True,
    subset='validation'  # Set as validation data
)

Here image_directory is "train_dataset/", i.e. the parent directory. Note that class_mode=None specifies that we don't want labels.

To get the batch:

for batch in train_generator:
    print(batch)
    break
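
For the original goal (predicting rather than training), the generator can be passed straight to predict. Here is a minimal sketch, assuming the generator was built with class_mode=None and shuffle=False and without the random augmentations shown earlier, so that the prediction order matches generator.filenames. The helper name predictions_by_filename is made up for this example, and cnn stands for the model from the question.

```python
def predictions_by_filename(model, generator):
    """Pair each file name with its prediction row.

    Assumes `generator` comes from flow_from_directory(..., class_mode=None,
    shuffle=False); with shuffling enabled the pairing would be wrong.
    """
    predictions = model.predict(generator)
    # generator.filenames lists paths relative to the parent directory,
    # in the same order in which the unshuffled images are yielded
    return dict(zip(generator.filenames, predictions))


# Hypothetical usage (whether rescale is needed depends on how cnn was trained):
# test_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
#     "train_dataset/", target_size=(64, 64), batch_size=8,
#     class_mode=None, shuffle=False)
# results = predictions_by_filename(cnn, test_generator)
```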

Upvotes: 0

paulgavrikov

Reputation: 1891

There are multiple ways. For larger datasets it is useful to use a tf.data.Dataset, as it can be tuned for performance quite easily. Here is the non-performance-optimized code. Replace <YOUR PATH INCL. REGEX> with a glob pattern such as ../input/pokemon-images-and-types/images/*/*.

import tensorflow as tf
AUTOTUNE = tf.data.experimental.AUTOTUNE  # Let tf.data choose the level of parallelism


def load(file_path):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img, channels=3)
    
    ... # do some preprocessing like resizing if necessary

    return img


list_ds = tf.data.Dataset.list_files(str('<YOUR PATH INCL. REGEX>'), shuffle=True)  # Get all images from subfolders
train_dataset = list_ds.take(-1)
# Set `num_parallel_calls` so multiple images are loaded/processed in parallel.
train_dataset = train_dataset.map(load, num_parallel_calls=AUTOTUNE)
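
The same idea (match files by pattern instead of by name, then process them in batches) also works in plain Python, which can be handy for checking what a pattern actually picks up. This is a rough sketch; the helper names list_image_paths and batched are invented for this example, and the commented Keras lines assume the cnn model and 64x64 input size from the question.

```python
import glob


def list_image_paths(pattern):
    """Return the sorted file paths matching a glob pattern such as 'images/*/*.jpeg'."""
    return sorted(glob.glob(pattern))


def batched(paths, batch_size):
    """Yield consecutive slices of at most `batch_size` paths."""
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


# Hypothetical wiring to the question's model:
# for chunk in batched(list_image_paths("images/*.jpeg"), 32):
#     arrays = np.stack([
#         image.img_to_array(image.load_img(p, target_size=(64, 64)))
#         for p in chunk
#     ])
#     results = cnn.predict(arrays)
```

Sorting the paths keeps the order reproducible, so each prediction row can be matched back to its file.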

Upvotes: 0

Alexandre Catalano

Reputation: 772

The solution tf.keras.preprocessing.image_dataset_from_directory can be updated to return the dataset and the image paths, as explained here: https://stackoverflow.com/a/63725072/4994352

Upvotes: 3
