Cptn. Array

Reputation: 59

Depthwise separable convolutions require more GPU memory

I have read many papers and web articles claiming that depthwise-separable convolutions reduce the memory a deep learning model requires compared to standard convolution. However, I do not understand how this can be the case, since a depthwise-separable convolution requires storing an extra intermediate activation tensor (the depthwise output) in addition to the final output tensor.

Here are two scenarios, for an H × W input with C_in channels, K × K kernels, and C_out output channels (see the sketch after this list):

1. Standard convolution: K·K·C_in·C_out weights, and a single H × W × C_out output activation to store.
2. Depthwise-separable convolution: only K·K·C_in + C_in·C_out weights, but two activations to store: the H × W × C_in intermediate output of the depthwise step, plus the H × W × C_out output of the pointwise step.
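To make the comparison concrete, here is a minimal PyTorch sketch of a depthwise-separable block; the class name and channel sizes are illustrative, not taken from my actual model:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, c_in, c_out, k=3, padding=1):
        super().__init__()
        # Depthwise step: one k-by-k filter per input channel (groups=c_in).
        self.depthwise = nn.Conv2d(c_in, c_in, k, padding=padding, groups=c_in)
        # Pointwise step: 1x1 convolution that mixes channels.
        self.pointwise = nn.Conv2d(c_in, c_out, kernel_size=1)

    def forward(self, x):
        # The intermediate tensor (N x c_in x H x W) is kept for the
        # backward pass, on top of the final (N x c_out x H x W) output.
        intermediate = self.depthwise(x)
        return self.pointwise(intermediate)

x = torch.randn(1, 64, 128, 128)
y = DepthwiseSeparableConv(64, 128)(x)
print(y.shape)  # torch.Size([1, 128, 128, 128])
```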

As additional evidence: using a U-Net implementation in PyTorch with standard nn.Conv2d convolutions, the model has 17.3M parameters and a forward/backward pass size of 320 MB. If I replace all convolutions with depthwise-separable convolutions, the model has 2M parameters but a forward/backward pass size of 500 MB. So fewer parameters, yet more memory required.
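A single-layer sketch of how the comparison can be reproduced, assuming the figures come from torchinfo (which reports both parameter counts and the forward/backward pass size; channel sizes here are illustrative):

```python
import torch.nn as nn
from torchinfo import summary  # assumption: torchinfo produced the figures above

standard = nn.Conv2d(64, 128, kernel_size=3, padding=1)
separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64),  # depthwise
    nn.Conv2d(64, 128, kernel_size=1),                       # pointwise
)

# Weights: 3*3*64*128 = 73,728 vs. 3*3*64 + 64*128 = 8,768 (plus biases),
# but the separable version reports a larger forward/backward pass size
# because of the extra intermediate activation it must store.
summary(standard, input_size=(1, 64, 128, 128))
summary(separable, input_size=(1, 64, 128, 128))
```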

I am sure I am going wrong somewhere, as every article states that depthwise-separable convolutions require less memory. Where am I going wrong with my logic?

Upvotes: 2

Views: 1037

Answers (0)
