Reputation: 27
How can I read images arranged in subfolders using the flow_from_dataframe function, rather than flow_from_directory, in Keras? The dataset is a directory with subfolders, plus a CSV file containing the labels ("classes") and image ids. Here is the code I used, followed by its output.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import pandas as pd

def append_ext(fn):
    return fn+".png"

traindf=pd.read_csv("trainLabels.csv",dtype=str)
print(traindf)
traindf["id"]=traindf["id"].apply(append_ext)
print(traindf)
datagen=ImageDataGenerator(rescale=1./255.,validation_split=0.25)
train_generator=datagen.flow_from_dataframe(
    dataframe=traindf,
    directory="./testdf/",
    x_col="id",
    y_col="label",
    subset="training",
    batch_size=32,
    seed=42,
    shuffle=True,
    classes = ["animal_1", "animal_2"],
    class_mode="categorical",
    target_size=(32,32))
valid_generator=datagen.flow_from_dataframe(
    dataframe=traindf,
    directory="./testdf/",
    x_col="id",
    y_col="label",
    subset="validation",
    batch_size=32,
    seed=42,
    shuffle=True,
    classes = ["animal_1", "animal_2"],
    class_mode="categorical",
    target_size=(32,32))
Found 0 validated image filenames belonging to 2 classes.
Found 0 validated image filenames belonging to 2 classes.
Thanks!
Upvotes: 1
Views: 838
Reputation: 8092
If I understand the directory structure correctly, it looks like this:
traindf
------ animal_1
------------ frogs_cars_etc
------------------ 1.png
------------------ 2.png
------------------ etc
------------------ 10.png
------ animal_2
------------ frogs_cars_etc
------------------ 1.png
------------------ 2.png
------------------ etc
------------------ 10.png
Now it looks to me like there are only 2 classes and 20 image files in total in the dataset. The CSV file you referenced therefore seems to have NO correlation to the actual data. You can create your own dataframe with the code below, but with so few samples I doubt a model is going to train at all.
import os
import pandas as pd

data_dir=r'.\traindf' # main directory
filepaths=[] # store list of filepaths to the images
labels = [] # store list of labels for each image file
classlist= os.listdir(data_dir) # should yield [animal_1, animal_2] these are the classes
for klass in classlist:
    classpath=os.path.join(data_dir, klass, 'frogs_cars_etc') # path to get to file list
    file_list=os.listdir(classpath) # list of files
    for f in file_list: # iterate through the list of files
        fpath=os.path.join(classpath, f) # full path to the file
        filepaths.append(fpath) # save the filepath
        labels.append(klass) # save the label
Fseries=pd.Series(filepaths, name='filepaths')
Lseries=pd.Series(labels, name='labels')
df=pd.concat([Fseries, Lseries], axis=1) # dataframe of form filepaths labels
print(df.head())
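The loop above hard-codes the intermediate folder name. If that folder differs between classes, a variation along these lines may be more forgiving. It is only a sketch: the helper names collect_image_records and records_to_dataframe and the IMAGE_SUFFIXES set are my own, not part of Keras or pandas, and it assumes the label is always the first-level subfolder name.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset contains.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def collect_image_records(data_dir):
    """Return sorted (absolute filepath, label) pairs for images under data_dir.

    The label is the name of the first-level subfolder (e.g. animal_1);
    any deeper folders between it and the image file are ignored.
    """
    root = Path(data_dir).resolve()
    records = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob walks every nested folder, so the intermediate name is irrelevant
        for path in sorted(class_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                records.append((str(path), class_dir.name))
    return records


def records_to_dataframe(records):
    """Build a dataframe with 'filepaths' and 'labels' columns."""
    import pandas as pd  # imported lazily; only needed for this step

    return pd.DataFrame(records, columns=["filepaths", "labels"])
```

Calling records_to_dataframe(collect_image_records("traindf")) should give the same two columns as the dataframe above, with absolute paths instead of relative ones.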
You can pass this dataframe to flow_from_dataframe with x_col='filepaths' and y_col='labels', but again, with only 20 images it will not be very useful.
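For what it is worth, here is a rough sketch of how that call might look. It assumes the dataframe has the 'filepaths' and 'labels' columns from above; make_generators and FLOW_KWARGS are names I made up for illustration, and the image size, batch size and seed are simply copied from the question. The paths are made absolute because, per the Keras docs, x_col should hold absolute paths when directory is None. As far as I know the validation subset is sliced by row position, so the rows are shuffled first; otherwise a dataframe ordered by class could put a single class in the validation set.

```python
import os

# Column names match the dataframe built above; the remaining values are
# copied from the question and are assumptions, not tuned settings.
FLOW_KWARGS = dict(
    x_col="filepaths",
    y_col="labels",
    directory=None,  # x_col already holds full paths, so no base directory
    class_mode="categorical",
    target_size=(32, 32),
    batch_size=32,
    seed=42,
)


def make_generators(df, validation_split=0.25):
    """Return (train_generator, valid_generator) for a filepaths/labels dataframe."""
    # Imported here so this module loads even where TensorFlow is not installed.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Make paths absolute and shuffle the rows before the split is taken.
    df = df.assign(filepaths=df["filepaths"].map(os.path.abspath))
    df = df.sample(frac=1, random_state=42).reset_index(drop=True)

    datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=validation_split)
    train_gen = datagen.flow_from_dataframe(
        dataframe=df, subset="training", shuffle=True, **FLOW_KWARGS
    )
    valid_gen = datagen.flow_from_dataframe(
        dataframe=df, subset="validation", shuffle=False, **FLOW_KWARGS
    )
    return train_gen, valid_gen
```

If this works as intended, each generator should report a non-zero count in its "Found ... validated image filenames" message.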
Upvotes: 1