Reputation: 3659
In machine learning tutorials that use Keras, the code to train the model is typically a single call:
model.fit(X_train,
          Y_train,
          nb_epoch=5,
          batch_size=128,
          verbose=1,
          validation_split=0.1)
This seems easy when the training data X_train and Y_train are small; both are NumPy ndarrays. In practical situations, however, the training data can run into gigabytes, which may be too large to fit into the computer's RAM.
How do you feed data into model.fit() when the training data is too large to hold in memory?
Upvotes: 3
Views: 453
Reputation: 2060
There is a simple solution for this in Keras: use Python generators, which load your data lazily, one batch at a time. If you are working with images, you can also use ImageDataGenerator.
import numpy as np

def generate_data(x, y, batch_size):
    # Loop forever; Keras stops after steps_per_epoch batches per epoch.
    while True:
        for start in range(0, len(x), batch_size):
            # Only one batch is materialised at a time; in practice each
            # slice would be read from disk here rather than from RAM.
            yield (np.array(x[start:start + batch_size]),
                   np.array(y[start:start + batch_size]))

model.fit_generator(
    generator=generate_data(x, y, batch_size),
    steps_per_epoch=num_batches,
    validation_data=generate_data(x_val, y_val, batch_size),
    validation_steps=num_batches_test)
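For image data, ImageDataGenerator can stream batches straight from disk, so the full dataset never has to fit in RAM. A minimal sketch, assuming a hypothetical data/train directory with one subfolder per class (the path, image size, and batch size are placeholders):

from keras.preprocessing.image import ImageDataGenerator

# Reads images from disk in batches; only one batch is held in memory at a time.
datagen = ImageDataGenerator(rescale=1. / 255)
train_gen = datagen.flow_from_directory(
    'data/train',              # hypothetical directory: one subfolder per class
    target_size=(150, 150),    # resize every image on the fly
    batch_size=128,
    class_mode='categorical')

model.fit_generator(
    train_gen,
    steps_per_epoch=train_gen.samples // 128,
    epochs=5)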
Upvotes: 6