
Reputation: 140

Memory usage of neural network, Keras

I am trying to develop a model for denoising images. I've been reading up on how to calculate memory usage of a neural network and the standard approach seems to be:

    params = depth_n x (kernel_width x kernel_height) x depth_n-1 + depth_n

By summing all parameters together in my network, I end up getting 1,038,097 which approximates to 4.2MB. It seems I have done a slight miscalculation in the last layer since Keras ends up getting 1,038,497 params. Nevertheless, this is a small difference. 4.2MB is just the parameters, and I've seen somewhere that one should multiply by 3 to include backprop and other needed calculations. This would then approximate to 13MB.
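As a cross-check, the formula above can be sketched in plain Python. The layer list is transcribed from the model summary further down; the helper name `conv2d_params` is made up for illustration, and 4 bytes per parameter assumes float32 weights.

```python
# Each entry: (filters, kernel_height, kernel_width, input_depth),
# transcribed from the model summary (hypothetical representation).
layers = [
    (1024, 3, 3, 1),
    (64, 3, 3, 1024),
    (64, 3, 3, 64),
    (64, 3, 3, 64),
    (64, 7, 7, 64),
    (64, 5, 5, 64),
    (32, 5, 5, 64),
    (32, 3, 3, 32),
    (1, 5, 5, 32),
]


def conv2d_params(filters, kernel_h, kernel_w, input_depth, use_bias=True):
    """Parameter count of one 2D convolution layer."""
    weights = filters * kernel_h * kernel_w * input_depth
    biases = filters if use_bias else 0  # one bias per output filter
    return weights + biases


per_layer = [conv2d_params(*layer) for layer in layers]
total_params = sum(per_layer)
param_bytes = total_params * 4  # assumption: float32, 4 bytes per parameter

print(per_layer)
print(total_params)  # 1038497, matching the Keras summary
print(f"{param_bytes / 1e6:.2f} MB")
```

With the bias counted once per output filter, the per-layer numbers agree with the `Param #` column of the summary.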

I have approximately 11 GB of GPU memory to work with, yet this model exhausts it. Where does all the extra memory usage come from? What am I missing? I know this post might be labeled as a duplicate, but none of the other questions seem to address what I am asking about.

My model:

    def network(self):
        weights = RandomUniform(minval=-0.05, maxval=0.05, seed=None)
        input_img = Input(shape=(self.img_rows, self.img_cols, self.channels))
        conv1 = Conv2D(1024, (3,3), activation='tanh', kernel_initializer=weights,
                padding='same', use_bias=True)(input_img)
        conv2 = Conv2D(64, (3,3), activation='tanh', kernel_initializer=weights,
                padding='same', use_bias=True)(conv1)
        conv3 = Conv2D(64, (3,3), activation='tanh', kernel_initializer=weights,
                padding='same', use_bias=True)(conv2)
        conv4 = Conv2D(64, (3,3), activation='relu', kernel_initializer=weights,
                padding='same', use_bias=True)(conv3)
        conv5 = Conv2D(64, (7,7), activation='relu', kernel_initializer=weights,
                padding='same', use_bias=True)(conv4)
        conv6 = Conv2D(64, (5,5), activation='relu', kernel_initializer=weights,
                padding='same', use_bias=True)(conv5)
        conv7 = Conv2D(32, (5,5), activation='relu', kernel_initializer=weights,
                padding='same', use_bias=True)(conv6)
        conv8 = Conv2D(32, (3,3), activation='relu', kernel_initializer=weights,
                padding='same', use_bias=True)(conv7)
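        # NOTE: conv9 is defined but never used; `decoded` is built on conv8,
        # which is why the summary below lists no 16-filter layer.
        conv9 = Conv2D(16, (3,3), activation='relu', kernel_initializer=weights,
                padding='same', use_bias=True)(conv8)
        decoded = Conv2D(1, (5,5), kernel_initializer=weights,
                padding='same', activation='sigmoid', use_bias=True)(conv8)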
        return input_img, decoded

    def compiler(self):
        self.model.compile(optimizer='RMSprop', loss='mse')

I assume my model is silly in a lot of ways and that there are multiple things to improve (dropout, other filter sizes and numbers, optimizers etc.), and all suggestions are gladly received, but the actual question still remains: why does this model consume so much memory? Is it due to the extremely high depth of conv1?

Model summary:

    Using TensorFlow backend.
    Layer (type)                 Output Shape              Param #   
    input_1 (InputLayer)         (None, 1751, 480, 1)      0         
    conv2d_1 (Conv2D)            (None, 1751, 480, 1024)   10240     
    conv2d_2 (Conv2D)            (None, 1751, 480, 64)     589888    
    conv2d_3 (Conv2D)            (None, 1751, 480, 64)     36928     
    conv2d_4 (Conv2D)            (None, 1751, 480, 64)     36928     
    conv2d_5 (Conv2D)            (None, 1751, 480, 64)     200768    
    conv2d_6 (Conv2D)            (None, 1751, 480, 64)     102464    
    conv2d_7 (Conv2D)            (None, 1751, 480, 32)     51232     
    conv2d_8 (Conv2D)            (None, 1751, 480, 32)     9248      
    conv2d_10 (Conv2D)           (None, 1751, 480, 1)      801       
    Total params: 1,038,497
    Trainable params: 1,038,497
    Non-trainable params: 0

Upvotes: 8

Views: 7640

Answers (1)


Reputation: 2116

You are correct, this is due to the number of filters in conv1. What you must compute is the memory required to store the activations:

As shown by your model.summary(), the output of this layer has shape (None, 1751, 480, 1024). For a single image, that is a total of 1751*480*1024 values. Activations are typically stored as float32, so each value takes 4 bytes. The output of this layer therefore requires 1751*480*1024*4 bytes, which is about 3.4 GB (3.2 GiB) per image for this layer alone.

If you were to change the number of filters to, say, 64, you would only need around 200 MB per image.
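The arithmetic can be sketched in a few lines of Python. This is a rough lower bound under stated assumptions: float32 activations, one image, forward pass only. It ignores gradients, optimizer state, and any workspace the framework allocates. The helper name `activation_bytes` is hypothetical, and the channel list is read off the model summary in the question.

```python
BYTES_PER_FLOAT32 = 4  # assumption: activations are stored as float32


def activation_bytes(height, width, channels, batch_size=1):
    """Bytes needed to hold one layer's output tensor."""
    return batch_size * height * width * channels * BYTES_PER_FLOAT32


height, width = 1751, 480
# Output depth of each Conv2D layer, in order, from the model summary.
output_channels = [1024, 64, 64, 64, 64, 64, 32, 32, 1]

conv1_bytes = activation_bytes(height, width, 1024)
conv1_64_bytes = activation_bytes(height, width, 64)
total_forward_bytes = sum(
    activation_bytes(height, width, c) for c in output_channels
)

print(f"conv1 with 1024 filters: {conv1_bytes / 2**30:.2f} GiB")
print(f"conv1 with 64 filters:   {conv1_64_bytes / 2**20:.1f} MiB")
print(f"all layer outputs:       {total_forward_bytes / 2**30:.2f} GiB")
```

Passing a larger `batch_size` scales every figure linearly, which is why a batch of only a few images can exceed 11 GB once the values kept for backpropagation are included.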

Either reduce the number of filters or reduce the batch size to 1.

Upvotes: 14
