Jedi Nerd

Reputation: 59

Use only N images from each class with ImageDataGenerator

There are 10 directories (one per label), each containing 800 images. I'm trying to use transfer learning to train my model. The data is loaded with ImageDataGenerator as shown below:

train_datagen = ImageDataGenerator(rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2) # set validation split

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary',
    subset='training') # set as training data

validation_generator = train_datagen.flow_from_directory(
    train_data_dir, # same directory as training data
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary',
    subset='validation') # set as validation data

model.fit_generator(
    train_generator,
    steps_per_epoch = train_generator.samples // batch_size,
    validation_data = validation_generator, 
    validation_steps = validation_generator.samples // batch_size,
    epochs = nb_epochs)

Is it possible to limit the number of images that ImageDataGenerator uses from each directory to 100 (or some N) instead of all 800?

Upvotes: 4

Views: 1247

Answers (1)

Smart Manoj

Reputation: 5824

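import os
import pandas as pd

def limit_data(data_dir, n=100):
    """Return a DataFrame listing at most n image paths per class directory."""
    rows = []
    for label in os.listdir(data_dir):
        # Keep only the first n files listed in this class directory
        for fname in os.listdir(os.path.join(data_dir, label))[:n]:
            rows.append((os.path.join(data_dir, label, fname), label))
    return pd.DataFrame(rows, columns=['filename', 'class'])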

Then pass the resulting DataFrame to the flow_from_dataframe method instead of using flow_from_directory.
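For completeness, here is a rough sketch of how the pieces might fit together. It is not the only way to do this, and several things in it are my own assumptions rather than part of the question: the helper names sample_files_per_class and make_generators are hypothetical, the files are picked at random instead of taking the first N, the image size of 224x224 is a placeholder, and class_mode='categorical' is used because there are 10 labels. The rows are shuffled before building the DataFrame because, as far as I understand it, validation_split in flow_from_dataframe slices the DataFrame by row position, so rows sorted by class could leave some classes out of the validation subset.

```python
import os
import random


def sample_files_per_class(data_dir, n=100, seed=0):
    """Return up to n randomly chosen (filepath, label) pairs per class folder."""
    rng = random.Random(seed)
    rows = []
    for label in sorted(os.listdir(data_dir)):
        class_dir = os.path.join(data_dir, label)
        if not os.path.isdir(class_dir):
            continue  # skip stray files sitting next to the class folders
        files = sorted(os.listdir(class_dir))
        for fname in rng.sample(files, min(n, len(files))):
            rows.append((os.path.join(class_dir, fname), label))
    # Mix the classes so a positional train/validation split sees all of them
    rng.shuffle(rows)
    return rows


def make_generators(rows, img_size=(224, 224), batch_size=32):
    """Build training and validation generators from the sampled rows."""
    # Imported lazily so the sampling helper can be used without these packages
    import pandas as pd
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    df = pd.DataFrame(rows, columns=['filename', 'class'])
    datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
    common = dict(
        dataframe=df,
        x_col='filename',   # column holding the full image paths
        y_col='class',      # column holding the label (directory name)
        target_size=img_size,
        batch_size=batch_size,
        class_mode='categorical',  # assumption: 10 labels, one-hot targets
    )
    train_gen = datagen.flow_from_dataframe(subset='training', **common)
    val_gen = datagen.flow_from_dataframe(subset='validation', **common)
    return train_gen, val_gen
```

Usage would then be something like train_generator, validation_generator = make_generators(sample_files_per_class(train_data_dir, n=100)), after which the generators can be passed to the model exactly as in the question.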

Upvotes: 2
