Omnia

Reputation: 887

How to handle the last batch using keras fit_generator

I am using a customised batch generator to work around the incompatible-shapes problem (a BroadcastGradientArgs error) that arises with the standard model.fit() function because of the small size of the last batch in the training data. I used the batch generator mentioned here with the model.fit_generator() function:

import math
import numpy as np
from keras.utils import Sequence


class Generator(Sequence):
    # Class is a dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.floor(self.x.shape[0] / self.batch_size) 

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size] #Line A
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        np.random.shuffle(self.indices)

But it seems that it discards the last batch if its size is smaller than the provided batch size. How can I update it to include the last batch and expand it (for example) with some repeated samples?

Also, somehow I don't get how "Line A" works!

Update: here is how I am using the generator with my model:

# dummy model
input_1 = Input(shape=(None,))
...
dense_1 = Dense(10, activation='relu')(input_1)
output_1 = Dense(1, activation='sigmoid')(dense_1)

model = Model(input_1, output_1)
print(model.summary())

#Compile and fit_generator
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

train_data_gen = Generator(x1_train, y_train, batch_size)
test_data_gen = Generator(x1_test, y_test, batch_size)

model.fit_generator(generator=train_data_gen, validation_data = test_data_gen, epochs=epochs, shuffle=False, verbose=1)

loss, accuracy = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f Accuracy: %0.5f' % (loss, accuracy))

Upvotes: 5

Views: 6781

Answers (2)

Mjd Al Mahasneh

Reputation: 47

Apart from the strategy in the other answer, this issue could be tackled in different ways, depending on your intention.

If you wish to repeat some samples in the last batch (until its size equals batch_size), as you suggested in your question, you could, for example, check whether you have run past the last sample in the dataset and, if so, handle it. E.g.:

batch_size = 32
N_batches = int(np.ceil(len(dataset) / batch_size))
batch_counter = 0
while True:
    current_batch = []
    idx_start = batch_size * batch_counter
    idx_end = batch_size * (batch_counter + 1)
    for idx in range(idx_start, idx_end):
        # If idx runs past the end of the dataset, reuse the last sample:
        idx = len(dataset) - 1 if (idx > len(dataset) - 1) else idx
        current_batch.append(dataset[idx])
        # ...
    batch_counter += 1
    if batch_counter == N_batches:
        batch_counter = 0

Obviously, it doesn't need to be the last sample, it could (for example) be a random sample from the dataset:

idx = random.randint(0, len(dataset) - 1) if (idx > len(dataset) - 1) else idx
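
If you would rather stay with the Sequence class from your question, the same padding idea could look roughly like this (an untested sketch; PaddedGenerator is just an illustrative name, and np.random.choice repeats random samples to fill the last batch):

import math
import numpy as np
from keras.utils import Sequence


class PaddedGenerator(Sequence):
    # Illustrative sketch: pads the last (short) batch with repeated samples
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        # ceil so the smaller last batch is not dropped
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        if len(inds) < self.batch_size:
            # repeat random samples until the batch is full
            extra = np.random.choice(self.indices, self.batch_size - len(inds))
            inds = np.concatenate([inds, extra])
        return self.x[inds], self.y[inds]

    def on_epoch_end(self):
        np.random.shuffle(self.indices)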

Hope this helps.

Upvotes: 0

Natthaphon Hongcharoen

Reputation: 2440

I think the culprit is this line:

    return math.floor(self.x.shape[0] / self.batch_size)

Replacing it with this might work:

    return math.ceil(self.x.shape[0] / self.batch_size) 

Imagine you have 100 samples and a batch size of 32. That divides into 3.125 batches. But if you use math.floor, __len__ becomes 3 and the leftover 0.125 of a batch (the last 4 samples) is discarded.
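
You can see the difference directly:

import math

samples, batch_size = 100, 32
print(math.floor(samples / batch_size))  # 3 -> the last 4 samples are never yielded
print(math.ceil(samples / batch_size))   # 4 -> the last batch just holds 4 samples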

As for Line A: if the batch size is 32, then when idx is 1 the slice [idx * self.batch_size:(idx + 1) * self.batch_size] becomes [32:64], in other words it picks the 33rd to 64th elements of self.indices.
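
You can check the slicing yourself, and also see why ceil is safe even though the last slice runs past the end (NumPy simply clips the slice):

import numpy as np

indices = np.arange(100)
batch_size = 32
print(indices[1 * batch_size:2 * batch_size])  # elements 32..63, i.e. the 33rd to 64th
print(indices[3 * batch_size:4 * batch_size])  # only [96 97 98 99], the short last batch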

Update 2: changed the input to have a None shape, used an LSTM, and added evaluation.

import os
os.environ['CUDA_VISIBLE_DEVICES'] = ""
import math
import numpy as np
from keras.models import Model
from keras.utils import Sequence
from keras.layers import Input, Dense, LSTM


class Generator(Sequence):
    # Class is a dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]  # Line A
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        np.random.shuffle(self.indices)


# dummy model
input_1 = Input(shape=(None, 10))
x = LSTM(90)(input_1)
x = Dense(10)(x)
x = Dense(1, activation='sigmoid')(x)

model = Model(input_1, x)
print(model.summary())

# Compile and fit_generator
model.compile(optimizer='adam', loss='binary_crossentropy')

x1_train = np.random.rand(1590, 20, 10)
x1_test = np.random.rand(90, 20, 10)
y_train = np.random.rand(1590, 1)
y_test = np.random.rand(90, 1)

train_data_gen = Generator(x1_train, y_train, 256)
test_data_gen = Generator(x1_test, y_test, 256)

model.fit_generator(generator=train_data_gen,
                    validation_data=test_data_gen,
                    epochs=5,
                    shuffle=False,
                    verbose=1)

loss = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f' % loss)

This runs without any problem.
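
If you also want the accuracy from your original evaluate call, compiling with metrics=['accuracy'] should work, since evaluate_generator then returns the loss followed by the metrics (not tested here):

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# ... fit as above ...
loss, accuracy = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f Accuracy: %0.5f' % (loss, accuracy))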

Upvotes: 9
