Reputation: 59
There are 10 directories (labels), each containing 800 images. I'm trying to use transfer learning to train my model. The data is loaded using ImageDataGenerator, as shown below:
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   validation_split=0.2)  # set validation split

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary',
    subset='training')  # set as training data

validation_generator = train_datagen.flow_from_directory(
    train_data_dir,  # same directory as training data
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary',
    subset='validation')  # set as validation data

model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // batch_size,
    epochs=nb_epochs)
Is it possible to limit the number of images used from each directory to 100 (or N) images, instead of all 800, using ImageDataGenerator?
Upvotes: 4
Views: 1247
Reputation: 5824
import os
import pandas as pd

def limit_data(data_dir, n=100):
    a = []
    for i in os.listdir(data_dir):  # one sub-directory per class
        # take at most the first n images from each class directory
        for k, j in enumerate(os.listdir(os.path.join(data_dir, i))):
            if k >= n:
                break
            a.append((os.path.join(data_dir, i, j), i))
    return pd.DataFrame(a, columns=['filename', 'class'])
Then pass the resulting DataFrame to the flow_from_dataframe method, as sketched below.
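A minimal sketch of how that could look, reusing the train_datagen, img_height, img_width, and batch_size names from the question. The paths in the 'filename' column already include the directory, so the directory argument is left at its default of None; the x_col and y_col defaults already match the DataFrame's column names. Note that with 10 classes, class_mode should be 'categorical' rather than 'binary':

df = limit_data(train_data_dir, n=100)

train_generator = train_datagen.flow_from_dataframe(
    df,
    x_col='filename',          # column holding the image paths
    y_col='class',             # column holding the labels
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',  # 10 classes, so not 'binary'
    subset='training')

validation_generator = train_datagen.flow_from_dataframe(
    df,
    x_col='filename',
    y_col='class',
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation')

Since validation_split was set on train_datagen, the subset argument works here just as it did with flow_from_directory, so the rest of the training code can stay unchanged.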
Upvotes: 2