Reputation: 2279
I'm reimplementing MobileNet, but I find that depthwise convolution is no faster than Conv2D (I haven't included the 1×1 pointwise convolution yet). Here's the test code, run on Colab: https://colab.research.google.com/drive/1nBuYrmmH5kM0jbtIZdsuiG6uJbU6mpA7?usp=sharing
import tensorflow as tf
import time
x = tf.random.normal((2, 64, 64, 3))
conv = tf.keras.layers.Conv2D(16, 3, strides=1, padding='same')
dw = tf.keras.layers.DepthwiseConv2D(3, padding='same')
start = time.time()
conv(x)
print('conv2d:', time.time() - start) # approximate 0.0036s
start = time.time()
dw(x)
print('dw:', time.time() - start) # approximate 0.0034s
%timeit conv(x) # 1000 loops, best of 3: 225 µs per loop
%timeit dw(x) # 1000 loops, best of 3: 352 µs per loop
I also tried it on my laptop using CPU only and observed similar results. Why would DepthwiseConv2D be slower than Conv2D? Did I make any mistakes?
Upvotes: 3
Views: 2657
Reputation: 26048
Although they use far fewer parameters and FLOPs, depthwise 2D convolutions can indeed be slower than regular 2D convolutions.
Gholami et al. (SqueezeNext: Hardware-Aware Neural Network Design) state:
The reason for this is the inefficiency of depthwise-separable convolution in terms of hardware performance, which is due to its poor arithmetic intensity (ratio of compute to memory operations).
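To make "poor arithmetic intensity" concrete, here is a back-of-the-envelope sketch (my own illustration, not from the paper) comparing FLOPs per byte moved for a regular 3×3 conv versus a depthwise 3×3 conv on the shapes from your test, assuming float32 tensors and that input, weights, and output each cross memory exactly once. Real kernels cache and tile, so treat the numbers as rough ratios, not measurements.

```python
# Back-of-the-envelope arithmetic intensity (FLOPs per byte of memory traffic)
# for the question's shapes: 64x64 input, 3 input channels, 16 output channels,
# 3x3 kernel, 'same' padding. Assumes float32 and one pass over each tensor.
H, W, C_in, C_out, K = 64, 64, 3, 16, 3
BYTES = 4  # float32

# Regular Conv2D: every output channel sums over every input channel,
# so each loaded input value feeds many multiply-adds.
conv_flops = 2 * H * W * K * K * C_in * C_out        # x2 for multiply + add
conv_bytes = BYTES * (H * W * C_in                   # read input
                      + K * K * C_in * C_out         # read weights
                      + H * W * C_out)               # write output

# DepthwiseConv2D: one filter per input channel, no cross-channel reduction,
# so far fewer FLOPs are amortized over roughly the same memory traffic.
dw_flops = 2 * H * W * K * K * C_in
dw_bytes = BYTES * (H * W * C_in                     # read input
                    + K * K * C_in                   # read weights
                    + H * W * C_in)                  # write output

print(f"conv2d:    {conv_flops / conv_bytes:.1f} FLOPs/byte")
print(f"depthwise: {dw_flops / dw_bytes:.1f} FLOPs/byte")
```

Under these assumptions the depthwise layer does several times less compute per byte moved, so on hardware where memory bandwidth (or kernel launch/dispatch overhead) dominates, it cannot run proportionally faster than the dense convolution, and at tiny sizes like this it can even lose.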
Upvotes: 4