Reputation: 887
I am using a customised batch generator to work around the incompatible-shapes problem (BroadcastGradientArgs error) caused by the small size of the last batch in the training data when using the standard model.fit() function. I used the batch generator mentioned here with the model.fit_generator() function:
class Generator(Sequence):
    # Class is a dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.floor(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]  # Line A
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        np.random.shuffle(self.indices)
But it seems that it discards the last batch if its size is smaller than the provided batch size. How can I update it to include the last batch and expand it (for example) with some repeated samples?
Also, somehow I don't get how "Line A" works!
Update: here is how I am using the generator with my model:
# dummy model
input_1 = Input(shape=(None,))
...
dense_1 = Dense(10, activation='relu')(input_1)
output_1 = Dense(1, activation='sigmoid')(dense_1)
model = Model(input_1, output_1)
print(model.summary())
#Compile and fit_generator
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
train_data_gen = Generator(x1_train, y_train, batch_size)
test_data_gen = Generator(x1_test, y_test, batch_size)
model.fit_generator(generator=train_data_gen, validation_data = test_data_gen, epochs=epochs, shuffle=False, verbose=1)
loss, accuracy = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f Accuracy: %0.5f' % (loss, accuracy))
Upvotes: 5
Views: 6781
Reputation: 47
Apart from the strategy in the other answers, such an issue can be tackled in different ways, depending on your intention.
If you wish to repeat some samples in the last batch (until the last batch's size equals batch_size), as you suggested in your question, you could (for example) check whether the end of the dataset has been reached and, if so, reuse an existing sample, e.g.:
batch_size = 32
N_batches = int(np.ceil(len(dataset) / batch_size))
batch_counter = 0

while True:
    current_batch = []
    idx_start = batch_size * batch_counter
    idx_end = batch_size * (batch_counter + 1)
    for idx in range(idx_start, idx_end):
        ## If idx runs past the end of the dataset, clamp it to the index of the last sample:
        idx = len(dataset) - 1 if (idx > len(dataset) - 1) else idx
        current_batch.append(dataset[idx])
    ...
    batch_counter += 1
    if (batch_counter == N_batches):
        batch_counter = 0
Obviously, it doesn't need to be the last sample; it could (for example) be a random sample from the dataset:
idx = random.randint(0, len(dataset) - 1) if (idx > len(dataset) - 1) else idx
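The same idea can also be applied directly inside the __getitem__ of your Sequence subclass, which keeps the fit_generator usage unchanged. A minimal sketch (the class name PaddedGenerator is just illustrative, and it assumes x_set and y_set are numpy arrays):
import math
import numpy as np
from keras.utils import Sequence

class PaddedGenerator(Sequence):
    # Sketch: pad the last (short) batch with repeated samples so that
    # every batch returned has exactly batch_size elements.
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        # ceil so the final partial batch is not dropped
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        if len(inds) < self.batch_size:
            # repeat random indices (you could also repeat the last index instead)
            extra = np.random.choice(self.indices, self.batch_size - len(inds))
            inds = np.concatenate([inds, extra])
        return self.x[inds], self.y[inds]

    def on_epoch_end(self):
        np.random.shuffle(self.indices)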
Hope this helps.
Upvotes: 0
Reputation: 2440
I think the culprit is this line:
return math.floor(self.x.shape[0] / self.batch_size)
Replacing it with this might work:
return math.ceil(self.x.shape[0] / self.batch_size)
Imagine you have 100 samples and a batch size of 32. That should give 3.125 batches, but if you use math.floor
it becomes 3 batches, and the remaining 0.125 (the last 4 samples) is discarded.
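A quick check of the two calls makes the difference visible:
import math

samples, batch_size = 100, 32
print(math.floor(samples / batch_size))  # 3 -> the last 4 samples are never served
print(math.ceil(samples / batch_size))   # 4 -> a final, smaller batch holds those 4 samples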
As for Line A: if the batch size is 32, then when idx is 1 the slice [idx * self.batch_size:(idx + 1) * self.batch_size]
becomes [32:64]
, in other words it picks the 33rd to 64th elements of self.indices
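A standalone snippet (not part of the model code) shows what that slice returns:
import numpy as np

indices = np.arange(100)
batch_size, idx = 32, 1
print(indices[idx * batch_size:(idx + 1) * batch_size])  # [32 33 ... 63], i.e. the 33rd to 64th elements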
**Update 2:** change the input to have a None shape, use an LSTM, and add evaluation.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = ""
import math
import numpy as np
from keras.models import Model
from keras.utils import Sequence
from keras.layers import Input, Dense, LSTM


class Generator(Sequence):
    # Class is a dataset wrapper for better training performance
    def __init__(self, x_set, y_set, batch_size=256):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        self.indices = np.arange(self.x.shape[0])

    def __len__(self):
        return math.ceil(self.x.shape[0] / self.batch_size)

    def __getitem__(self, idx):
        inds = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]  # Line A
        batch_x = self.x[inds]
        batch_y = self.y[inds]
        return batch_x, batch_y

    def on_epoch_end(self):
        np.random.shuffle(self.indices)


# dummy model
input_1 = Input(shape=(None, 10))
x = LSTM(90)(input_1)
x = Dense(10)(x)
x = Dense(1, activation='sigmoid')(x)
model = Model(input_1, x)
print(model.summary())

# Compile and fit_generator
model.compile(optimizer='adam', loss='binary_crossentropy')

x1_train = np.random.rand(1590, 20, 10)
x1_test = np.random.rand(90, 20, 10)
y_train = np.random.rand(1590, 1)
y_test = np.random.rand(90, 1)

train_data_gen = Generator(x1_train, y_train, 256)
test_data_gen = Generator(x1_test, y_test, 256)

model.fit_generator(generator=train_data_gen,
                    validation_data=test_data_gen,
                    epochs=5,
                    shuffle=False,
                    verbose=1)

loss = model.evaluate_generator(generator=test_data_gen)
print('Test Loss: %0.5f' % loss)
This runs without any problems.
Upvotes: 9