user1315789

Reputation: 3659

Fit data into a Keras machine learning model when the data is huge

In machine learning tutorials using Keras, the code to train the model is typically this one-liner:

model.fit(X_train,
          Y_train,
          nb_epoch=5,
          batch_size=128,
          verbose=1,
          validation_split=0.1)

This seems easy when the training data X_train and Y_train are small; both are numpy ndarrays. In practice, though, the training data can run into gigabytes, which may be too large to fit into the computer's RAM.

How do you feed data into model.fit() when the training data is too large to hold in memory?

Upvotes: 3

Views: 453

Answers (1)

ixeption

Reputation: 2060

There is a simple solution for that in Keras: use Python generators, so the data is lazily loaded one batch at a time. If you are working with images, you can also use the ImageDataGenerator (a sketch follows at the end of this answer).

import numpy as np

def generate_data(x, y, batch_size):
    # Loop forever: fit_generator draws steps_per_epoch batches per epoch
    while True:
        for i in range(0, len(x), batch_size):
            # Yield an (inputs, targets) tuple; only this slice
            # is materialised in memory
            yield np.array(x[i:i + batch_size]), np.array(y[i:i + batch_size])
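
The x and y passed in do not have to live in memory. As a minimal sketch, assuming the data was saved beforehand as .npy files (the file names here are placeholders), numpy's mmap_mode keeps the arrays on disk and reads only the slices the generator actually touches:

x = np.load('X_train.npy', mmap_mode='r')  # memory-mapped, stays on disk
y = np.load('Y_train.npy', mmap_mode='r')
num_batches = int(np.ceil(len(x) / batch_size))  # steps needed to cover the data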

model.fit_generator(
    generator=generate_data(x, y, batch_size),
    steps_per_epoch=num_batches,
    validation_data=generate_data(x_val, y_val, batch_size),
    validation_steps=num_batches_test)
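
For the image case mentioned above, here is a minimal sketch of ImageDataGenerator streaming batches straight from disk; the directory path, image size, and one-sub-folder-per-class layout are assumptions, not from the question:

from keras.preprocessing.image import ImageDataGenerator

# Reads image files in batches from data/train/<class_name>/...
datagen = ImageDataGenerator(rescale=1. / 255)
train_gen = datagen.flow_from_directory('data/train',
                                        target_size=(224, 224),
                                        batch_size=128,
                                        class_mode='categorical')

model.fit_generator(train_gen,
                    steps_per_epoch=len(train_gen))

flow_from_directory infers the class labels from the sub-folder names, so no labels array has to be built in memory either.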

Upvotes: 6
