Convolutional layer in Python using Numpy

Question

I am trying to implement a convolutional layer in Python using Numpy. The input is a 4-dimensional array of shape [N, H, W, C], where:

N: Batch size
H: Height of image
W: Width of image
C: Number of channels

The convolutional filter is also a 4-dimensional array of shape [F, F, Cin, Cout], where:

F: Height and width of a square filter
Cin: Number of input channels (Cin = C)
Cout: Number of output channels

Assuming a stride of one along all axes, and no padding, the output should be a 4-dimensional array of shape [N, H - F + 1, W - F + 1, Cout].

My code is as follows:

import numpy as np

def conv2d(image, filter):
  # Height and width of output image
  Hout = image.shape[1] - filter.shape[0] + 1
  Wout = image.shape[2] - filter.shape[1] + 1

  output = np.zeros([image.shape[0], Hout, Wout, filter.shape[3]])

  for n in range(output.shape[0]):
    for i in range(output.shape[1]):
      for j in range(output.shape[2]):
        for cout in range(output.shape[3]):
          output[n,i,j,cout] = np.multiply(image[n, i:i+filter.shape[0], j:j+filter.shape[1], :], filter[:,:,:,cout]).sum()

  return output

This works perfectly, but uses four for loops and is extremely slow. Is there a better way of implementing a convolutional layer that takes 4-dimensional input and filter, and returns a 4-dimensional output, using Numpy?

ZisIsNotZis · Accepted Answer

This is a straightforward implementation of this kind of Keras-like (?) convolution. It may be hard for beginners to follow because it relies heavily on broadcasting and stride tricks.

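import numpy as np
from numpy.lib.stride_tricks import as_strided

def conv2d(a, b):
    # View a as [N, Hout, Wout, F, F, Cin] sliding windows without copying data
    a = as_strided(a,(len(a),a.shape[1]-len(b)+1,a.shape[2]-b.shape[1]+1,len(b),b.shape[1],a.shape[3]),a.strides[:3]+a.strides[1:])
    # Flip the kernel spatially, then contract the window and channel axes
    return np.einsum('abcijk,ijkd', a, b[::-1,::-1])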
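The same windowed view can also be built with numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later), which works out the shape and strides itself and returns a read-only view. The following is a minimal sketch of that alternative, not the answer's own code: the name conv2d_windows is made up for illustration, and it does not flip the kernel, so it should match the loop in the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_windows(image, kernel):
    # Hypothetical helper. image: [N, H, W, Cin], kernel: [Fh, Fw, Cin, Cout]
    fh, fw = kernel.shape[:2]
    # Slide an (fh, fw) window over the H and W axes. The window axes are
    # appended at the end, giving shape [N, Hout, Wout, Cin, fh, fw].
    windows = sliding_window_view(image, (fh, fw), axis=(1, 2))
    # Contract the window axes (i, j) and the input-channel axis (c).
    return np.einsum('nhwcij,ijcd->nhwd', windows, kernel)
```

Note that the axis order of the view differs from the as_strided version (channels come before the window axes), which is why the einsum subscripts are spelled out explicitly here.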

BTW: if you are convolving with a very large kernel, use an FFT-based algorithm instead.
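For illustration, here is one way an FFT-based version might look, using only np.fft.rfft2 and np.fft.irfft2. This is a sketch under assumptions rather than a tuned implementation: conv2d_fft is a hypothetical name, the layout is the [N, H, W, Cin] / [F, F, Cin, Cout] one from the question, and no padding or striding is handled. Multiplying in the frequency domain gives a true convolution, so the kernel is effectively flipped; pass kernel[::-1, ::-1] to get the unflipped behavior of the loop in the question.

```python
import numpy as np

def conv2d_fft(image, kernel):
    # Hypothetical helper. image: [N, H, W, Cin], kernel: [Fh, Fw, Cin, Cout]
    n, h, w, c = image.shape
    fh, fw = kernel.shape[:2]
    # Transform the spatial axes; the kernel is zero-padded to the image size.
    img_f = np.fft.rfft2(image, axes=(1, 2))             # [N, H, W//2+1, Cin]
    ker_f = np.fft.rfft2(kernel, s=(h, w), axes=(0, 1))  # [H, W//2+1, Cin, Cout]
    # Pointwise product per frequency, summed over the input channels.
    prod = np.einsum('nhwc,hwcd->nhwd', img_f, ker_f)
    # Back to the spatial domain: a circular convolution of size H x W.
    full = np.fft.irfft2(prod, s=(h, w), axes=(1, 2))
    # Positions from F-1 onward are free of wrap-around and form the
    # "valid" output of shape [N, H-Fh+1, W-Fw+1, Cout].
    return full[:, fh - 1:, fw - 1:, :]
```

The padded kernel spectrum has H * (W//2 + 1) * Cin * Cout complex entries, so this trades memory for speed and is only likely to pay off when the kernel is large relative to the image.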

EDIT: Remove the [::-1,::-1] if the convolution should not flip the kernel first (as in TensorFlow, and as in the loop in the question).

EDIT: np.tensordot(a, b, axes=3) performs much better than np.einsum("abcijk,ijkd", a, b), and is highly recommended. So, the function becomes:

import numpy as np
from numpy.lib.stride_tricks import as_strided

def conv2d(a, b):
  Hout = a.shape[1] - b.shape[0] + 1
  Wout = a.shape[2] - b.shape[1] + 1

  a = as_strided(a, (a.shape[0], Hout, Wout, b.shape[0], b.shape[1], a.shape[3]), a.strides[:3] + a.strides[1:])

  return np.tensordot(a, b, axes=3)
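Because as_strided does not check that the requested shape and strides stay inside the array's memory, a mismatched input can silently read garbage instead of raising an error. The sketch below wraps the same approach with explicit shape checks and compares it against a simple loop-based reference on small random inputs. The names conv2d_strided and conv2d_loops are hypothetical, and the checks are an addition for illustration, not part of the answer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def conv2d_strided(a, b):
    # Same approach as above, with input validation added (an assumption
    # about what callers want; the answer itself does no checking).
    if a.ndim != 4 or b.ndim != 4 or a.shape[3] != b.shape[2]:
        raise ValueError("expected a: [N, H, W, Cin] and b: [Fh, Fw, Cin, Cout]")
    Hout = a.shape[1] - b.shape[0] + 1
    Wout = a.shape[2] - b.shape[1] + 1
    if Hout < 1 or Wout < 1:
        raise ValueError("filter is larger than the image")
    windows = as_strided(
        a,
        (a.shape[0], Hout, Wout, b.shape[0], b.shape[1], a.shape[3]),
        a.strides[:3] + a.strides[1:],
        writeable=False,  # the windows overlap, so writing would be unsafe
    )
    return np.tensordot(windows, b, axes=3)

def conv2d_loops(a, b):
    # Direct reference: one window-by-filter contraction per output pixel.
    Fh, Fw = b.shape[:2]
    Hout, Wout = a.shape[1] - Fh + 1, a.shape[2] - Fw + 1
    out = np.zeros((a.shape[0], Hout, Wout, b.shape[3]))
    for i in range(Hout):
        for j in range(Wout):
            out[:, i, j, :] = np.tensordot(a[:, i:i+Fh, j:j+Fw, :], b, axes=3)
    return out

rng = np.random.default_rng(0)
a = rng.standard_normal((2, 8, 7, 3))
b = rng.standard_normal((3, 3, 3, 5))
print(np.allclose(conv2d_strided(a, b), conv2d_loops(a, b)))
```

Since the strides are taken from a.strides rather than assumed, the strided version should also work on non-contiguous inputs such as slices.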
