Reputation: 60
I have several .jpeg images with different names that I want to load into a CNN in a Jupyter notebook to have them classified. The only way I found was:
from tensorflow.keras.preprocessing import image
import numpy as np

test_image = image.load_img("name_of_picture.jpeg", target_size=(64, 64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)
result = cnn.predict(test_image)
All the other options I found in the Keras API, like tf.keras.preprocessing.image_dataset_from_directory(), seem to only work on labeled data. Sadly I can't "simply" iterate over the names of the pictures, as they are all named differently. Is there a way to predict all of them at once without naming every single picture?
Thanks for your help,
Nick
Upvotes: 0
Views: 1148
Reputation: 738
Store the image files in a subdirectory like this:
train_dataset
|
|--class_0
   |
   |-- <images>
Now you can use ImageDataGenerator to load the images from the directory (in our case, from "train_dataset/class_0"):
from tensorflow.keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(
    rescale=1./255,          # Normalize the images
    rotation_range=20,       # Random rotations
    width_shift_range=0.2,   # Horizontal shifts
    height_shift_range=0.2,  # Vertical shifts
    shear_range=0.2,         # Shear transformations
    zoom_range=0.2,          # Random zoom
    horizontal_flip=True,    # Horizontal flips
    fill_mode='nearest',     # Filling strategy for new pixels
    validation_split=0.2     # Reserve part of the data for validation
)
train_generator = datagen.flow_from_directory(
    image_directory,
    target_size=(64, 64),  # Adjust based on your needs
    batch_size=8,
    class_mode=None,       # No labels (unlabeled data)
    shuffle=True,
    subset='training'      # Set as training data
)
validation_generator = datagen.flow_from_directory(
    image_directory,
    target_size=(64, 64),  # Adjust based on your needs
    batch_size=8,
    class_mode=None,       # No labels (unlabeled data)
    shuffle=True,
    subset='validation'    # Set as validation data
)
Here image_directory is "train_dataset/", i.e. the parent directory. Note that class_mode=None specifies that we don't want labels.
To get the batch:
for batch in train_generator:
    print(batch)
    break
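Putting this together for the original prediction use case: a minimal sketch (assuming a trained model cnn and the train_dataset/class_0 layout above; the helper name predict_directory is my own) that runs cnn.predict over every image in the directory. Using shuffle=False keeps the predictions aligned with generator.filenames.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def predict_directory(cnn, image_directory, target_size=(64, 64), batch_size=8):
    """Run cnn.predict over every image under image_directory.

    Returns (filenames, predictions) in matching order.
    """
    datagen = ImageDataGenerator(rescale=1. / 255)
    generator = datagen.flow_from_directory(
        image_directory,
        target_size=target_size,
        batch_size=batch_size,
        class_mode=None,  # no labels, just image batches
        shuffle=False,    # keep order so predictions line up with filenames
    )
    predictions = cnn.predict(generator)
    return generator.filenames, predictions
```

Since flow_from_directory expects at least one subdirectory, the class_0 folder trick above is still needed here.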
Upvotes: 0
Reputation: 1891
There are multiple ways; for larger data it is useful to use a tf.data.Dataset, as it can be tweaked for performance quite easily. I will give you the non-performance-optimized code. Replace <YOUR PATH INCL. REGEX> with a path like ../input/pokemon-images-and-types/images/*/*.
import tensorflow as tf
from tensorflow.data.experimental import AUTOTUNE
def load(file_path):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img, channels=3)
    ...  # do some preprocessing like resizing if necessary
    return img
list_ds = tf.data.Dataset.list_files(str('<YOUR PATH INCL. REGEX>'), shuffle=True) # Get all images from subfolders
train_dataset = list_ds.take(-1)
# Set `num_parallel_calls` so multiple images are loaded/processed in parallel.
train_dataset = train_dataset.map(load, num_parallel_calls=AUTOTUNE)
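To actually get predictions out of this pipeline, the dataset just needs to be batched before handing it to the model. A self-contained sketch of the same idea (the helper name build_image_dataset and the resize/rescale preprocessing are my assumptions, not part of the answer above):

```python
import tensorflow as tf

def build_image_dataset(pattern, image_size=(64, 64), batch_size=8):
    """Build a batched tf.data pipeline of preprocessed images from a glob pattern."""
    def load(file_path):
        img = tf.io.read_file(file_path)
        img = tf.image.decode_jpeg(img, channels=3)
        # Assumed preprocessing: resize to the model's input size, rescale to [0, 1]
        img = tf.image.resize(img, image_size) / 255.0
        return img

    list_ds = tf.data.Dataset.list_files(pattern, shuffle=False)
    return list_ds.map(load, num_parallel_calls=tf.data.AUTOTUNE).batch(batch_size)

# Usage (hypothetical path, trained model `cnn`):
# predictions = cnn.predict(build_image_dataset("<YOUR PATH INCL. REGEX>"))
```

Keeping shuffle=False in list_files makes it easier to match predictions back to file paths afterwards.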
Upvotes: 0
Reputation: 772
The solution tf.keras.preprocessing.image_dataset_from_directory can be updated to return both the dataset and the image paths, as explained here -> https://stackoverflow.com/a/63725072/4994352
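For completeness, a hedged sketch of that approach: with labels=None the loader yields unlabeled image batches, and the returned dataset carries a file_paths attribute (TF 2.x), so predictions can be matched back to file names. The helper name predict_unlabeled is my own, and note that this loader does not rescale pixel values.

```python
import tensorflow as tf

def predict_unlabeled(cnn, directory, image_size=(64, 64), batch_size=8):
    """Predict every image under directory; returns (file_paths, predictions)."""
    ds = tf.keras.preprocessing.image_dataset_from_directory(
        directory,
        labels=None,          # no labels: batches of images only
        image_size=image_size,
        batch_size=batch_size,
        shuffle=False,        # keep order aligned with ds.file_paths
    )
    # Pixel values are in [0, 255] here; add a rescaling step if the
    # model was trained on [0, 1] inputs.
    return ds.file_paths, cnn.predict(ds)
```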
Upvotes: 3