BillTheKid

Reputation: 447

Is memory supposed to be this high during model.fit using a generator?

The TensorFlow versions in which I can reproduce this behavior are 2.7.0, 2.7.3, 2.8.0 and 2.9.0. These are, in fact, all the versions I have tried; I was not able to resolve the issue in any of them.

I am trying to feed my data to a model using a generator:

class DataGen(tf.keras.utils.Sequence):
    def __init__(self, indices, batch_size):
        # X and y are the h5py datasets opened at module level
        self.X = X
        self.y = y
        self.indices = indices
        self.batch_size = batch_size
    
    def __getitem__(self, index):
        X_batch = self.X[self.indices][
            index * self.batch_size : (index + 1) * self.batch_size
        ]
        y_batch = self.y[self.indices][
            index * self.batch_size : (index + 1) * self.batch_size
        ]
        return X_batch, y_batch
    
    def __len__(self):
        return len(self.y[self.indices]) // self.batch_size

train_gen = DataGen(train_indices, 32)
val_gen = DataGen(val_indices, 32)
test_gen = DataGen(test_indices, 32)

where X and y are my dataset loaded from a .h5 file using h5py, and train_indices, val_indices and test_indices are the index arrays for each split that will be applied to X and y.
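For context, a setup along these lines would match the description above (the file name matches the one used later in this thread; the split sizes and shapes are only illustrative):

import h5py
import numpy as np

# X and y stay as h5py datasets, so samples are read from disk on demand
file = h5py.File('data.h5', 'r')
X = file['X']   # e.g. images of shape (N, 128, 128, 3)
y = file['y']   # e.g. one-hot labels of shape (N, 27)

# index arrays selecting each split (70/20/10 is illustrative)
n = len(X)
train_indices = np.arange(0, int(n * 0.7))
val_indices = np.arange(int(n * 0.7), int(n * 0.9))
test_indices = np.arange(int(n * 0.9), n)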

I am creating the model and feeding the data using:

from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential

# setup model
base_model = tf.keras.applications.MobileNetV2(input_shape=(128, 128, 3),
                                                include_top=False)
base_model.trainable = False

mobilenet1 = Sequential([
    base_model,
    Flatten(),
    Dense(27, activation='softmax')
])

mobilenet1.compile(optimizer=tf.keras.optimizers.Adam(),
                   loss=tf.keras.losses.CategoricalCrossentropy(),
                   metrics=['accuracy'])
# model training
hist_mobilenet = mobilenet1.fit(train_gen, validation_data=val_gen, epochs=1)

The memory usage right before training is 8%, but the moment training starts it jumps to values between 30% and 60%. Since I am using a generator and loading the data in small batches of 32 observations at a time, it seems odd to me that memory climbs this high. Also, even when training stops, memory usage stays above 30%. I checked all global variables, but none of them is anywhere near that size. If I start another training session, memory usage climbs even higher and eventually the Jupyter notebook kernel dies.
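(For anyone reproducing this: a quick way to watch memory per epoch is a small callback, assuming the psutil package is installed. This is just a monitoring sketch, not part of the training setup.)

import os
import psutil
import tensorflow as tf

class MemoryLogger(tf.keras.callbacks.Callback):
    # prints the resident memory of this process after every epoch
    def on_epoch_end(self, epoch, logs=None):
        rss = psutil.Process(os.getpid()).memory_info().rss
        print(f"epoch {epoch}: {rss / 1024 ** 3:.2f} GB resident, "
              f"{psutil.virtual_memory().percent}% of system RAM in use")

# mobilenet1.fit(train_gen, validation_data=val_gen, epochs=1,
#                callbacks=[MemoryLogger()])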

Is something wrong with my implementation, or is this normal?

Edit 1: some additional info.

Upvotes: 5

Views: 3198

Answers (3)

BillTheKid

Reputation: 447

How to minimize RAM usage

From the very helpful comments and answers of our fellow friends, I came to this conclusion:

  • First, we have to save the data to an HDF5 file, so we would not have to load the whole dataset in memory.
import h5py as h5
import gc
file = h5.File('data.h5', 'r')
X = file['X']
y = file['y']
gc.collect()

I am using the garbage collector just to be safe. (If your data is still sitting in NumPy arrays, a sketch of the one-off conversion to HDF5 follows below.)
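A minimal way to write such a file, assuming the in-memory arrays are called X_np and y_np (hypothetical names), could look like this:

import h5py as h5

# one-off conversion: persist the arrays so they never need to live in RAM again
with h5.File('data.h5', 'w') as f:
    f.create_dataset('X', data=X_np, chunks=True, compression='gzip')
    f.create_dataset('y', data=y_np, chunks=True, compression='gzip')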

  • Then, we do not have to pass the data itself to the generator, as X and y will be the same for training, validation and testing. In order to differentiate between the splits, we will use index maps:
import numpy as np
import tensorflow as tf

# split data for validation and testing
val_split, test_split = 0.2, 0.1

train_indices = np.arange(len(X))[:-int(len(X) * (val_split + test_split))]
val_indices = np.arange(len(X))[-int(len(X) * (val_split + test_split)) : -int(len(X) * test_split)]
test_indices = np.arange(len(X))[-int(len(X) * test_split):]


class DataGen(tf.keras.utils.Sequence):
    def __init__(self, index_map, batch_size):
        # X and y are the h5py datasets opened above; only the index map differs per split
        self.X = X
        self.y = y
        self.index_map = index_map
        self.batch_size = batch_size
    
    def __getitem__(self, index):
        X_batch = self.X[self.index_map[
            index * self.batch_size : (index + 1) * self.batch_size
        ]]
        y_batch = self.y[self.index_map[
            index * self.batch_size : (index + 1) * self.batch_size
        ]]
        return X_batch, y_batch
    
    def __len__(self):
        return len(self.index_map) // self.batch_size

train_gen = DataGen(train_indices, 32)
val_gen = DataGen(val_indices, 32)
test_gen = DataGen(test_indices, 32)
  • Last thing to notice is how I implemented the data fetching inside __getitem__.

Correct solution:

X_batch = self.X[self.index_map[
            index * self.batch_size : (index + 1) * self.batch_size
        ]]

Wrong solution:

X_batch = self.X[self.index_map][
            index * self.batch_size : (index + 1) * self.batch_size
        ]

The same applies to y.

Notice the difference? In the wrong solution I am loading the whole dataset (training, validation or testing) into memory! Instead, in the correct solution I am only loading the batch that is about to be fed to the fit method.
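To make the difference concrete, here is a small check of how much data each indexing order actually materialises in RAM (the shapes follow the (128, 128, 3) input from the question):

# wrong order: X[train_indices] reads every training row from the .h5 file
# into a NumPy array before the batch slice is applied
whole_split = X[train_indices]                  # shape (len(train_indices), 128, 128, 3) in RAM
print(whole_split.nbytes / 1024 ** 3, "GB")

# correct order: train_indices[0:32] is only 32 indices, so h5py reads
# just those 32 rows from disk
one_batch = X[train_indices[0 * 32 : 1 * 32]]   # shape (32, 128, 128, 3) in RAM
print(one_batch.nbytes / 1024 ** 2, "MB")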

With this setup, I managed to keep RAM usage down to only 2.88 GB, which is pretty cool!

Upvotes: 3

Ashwin Raikar

Reputation: 128

Make use of fit_generator instead of the fit method

I mean instead of

hist_mobilenet = mobilenet1.fit(train_gen, validation_data=val_gen, epochs=1)

Use

hist_mobilenet = mobilenet1.fit_generator(train_gen, validation_data=val_gen, epochs=1)

According to this answer:

Keras' fit method loads all the data into memory at once, meaning changing your batch size will have no effect on the RAM it takes up. Have a look at using fit_generator, which is designed for use with a large dataset.

I think fit_generator will load the data batch-wise and not take up the whole RAM at once.

Upvotes: 0

stahh

Reputation: 157

Literally, this is not a generator. When you instantiate DataGen, you create a complete class instance holding the full index array (__init__(self, indices, batch_size)), references to the datasets (self.X, self.y), inheritance from Sequence, and so on.

The simplest real generator for tensorflow looks something like this:

import tensorflow as tf
from sklearn.model_selection import train_test_split

BATCH_SIZE = 32
AUTOTUNE = tf.data.AUTOTUNE

# hold out a test set, then carve a validation set from the training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
X_val = X_train[int(len(X_train) * 0.8):]
X_train = X_train[:int(len(X_train) * 0.8)]
y_val = y_train[int(len(y_train) * 0.8):]
y_train = y_train[:int(len(y_train) * 0.8)]

def gen_reader(X_train, y_train):
    # yield one (sample, label) pair at a time
    for data, label in zip(X_train, y_train):
        yield data, label

train_ds = tf.data.Dataset.from_generator(gen_reader, args=[X_train, y_train], output_types=(tf.float64, tf.int8)).batch(BATCH_SIZE).prefetch(buffer_size=AUTOTUNE)
val_ds = tf.data.Dataset.from_generator(gen_reader, args=[X_val, y_val], output_types=(tf.float64, tf.int8)).batch(BATCH_SIZE).prefetch(buffer_size=AUTOTUNE)
test_ds = tf.data.Dataset.from_generator(gen_reader, args=[X_test, y_test], output_types=(tf.float64, tf.int8)).batch(BATCH_SIZE).prefetch(buffer_size=AUTOTUNE)

...

hist_mobilenet = mobilenet1.fit(train_ds, validation_data=val_ds, epochs=1)

Upvotes: 5
