Ajay H

Reputation: 824

Prove Convolution is Equivariant with respect to translation

I was reading the following statement about how convolution is equivariant with respect to translation from the Deep Learning Book.

Let g be a function mapping one image function to another image function, such that I' = g(I) is the image function with I'(x, y) = I(x − 1, y). This shifts every pixel of I one unit to the right. If we apply this transformation to I, then apply convolution, the result will be the same as if we applied convolution to I', then applied the transformation g to the output.

In the last sentence of that passage, they apply convolution to I', but shouldn't this be I? I' is the already translated image. Otherwise the statement would effectively be saying:

f(g(I)) = g( f(g(I)) )

where f is convolution and g is translation. I am trying to verify this myself in Python on a color image of a house, using a 3D kernel whose depth equals the number of image channels, as would be the case in a convolution layer.

Input colored image

Here is my code for applying a translation & then convolution to an image.

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import scipy
import scipy.ndimage

I = scipy.ndimage.imread('pics/house.jpg')

def convolution(A, B):
    return np.sum( np.multiply(A, B) )

k = np.array([[[0,1,-1],[1,-1,0],[0,0,0]], [[-1,0,-1],[1,-1,0],[1,0,0]], [[1,-1,0],[1,0,1],[-1,0,1]]]) #kernel


## Translation
translated = 100
new_I = np.zeros( (I.shape[0]-translated, I.shape[1], I.shape[2]) )

for i in range(translated, I.shape[0]):
    for j in range(I.shape[1]):
        for l in range(I.shape[2]):
            new_I[i-translated,j,l] = I[i,j,l]

## Convolution
conv = np.zeros( (int((new_I.shape[0]-3)/2), int((new_I.shape[1]-3)/2) ) )

for i in range( conv.shape[0] ):
    for j in range(conv.shape[1]):
        conv[i, j] = convolution(new_I[2*i:2*i+3, 2*j:2*j+3, :], k)


scipy.misc.imsave('pics/convoled_image_2nd.png', conv)

I get the following output:

Translated then Convolved

Now, I switch the convolution and Translation steps:

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import scipy
import scipy.ndimage

I = scipy.ndimage.imread('pics/house.jpg')

def convolution(A, B):
    return np.sum( np.multiply(A, B) )

k = np.array([[[0,1,-1],[1,-1,0],[0,0,0]], [[-1,0,-1],[1,-1,0],[1,0,0]], [[1,-1,0],[1,0,1],[-1,0,1]]]) #kernel


## Convolution
conv = np.zeros( (int((I.shape[0]-3)/2), int((I.shape[1]-3)/2) ) )

for i in range( conv.shape[0] ):
    for j in range(conv.shape[1]):
        conv[i, j] = convolution(I[2*i:2*i+3, 2*j:2*j+3, :], k)


## Translation
translated = 100
new_I = np.zeros( (conv.shape[0]-translated, conv.shape[1]) )

for i in range(translated, conv.shape[0]):
    for j in range(conv.shape[1]):
        new_I[i-translated,j] = conv[i,j]


scipy.misc.imsave('pics/conv_trans_image.png', new_I)

And now I get the following output:

First Convolved with kernel & then translated

Shouldn't they be the same according to the book? What am I doing wrong?

Upvotes: 3

Views: 927

Answers (1)

Mateen Ulhaq

Reputation: 27251

Just as the book says, convolution and translation commute: convolution is linear and shift-invariant, so the order of the two operations is interchangeable, except for boundary effects.
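
One way to see why is the short derivation below. It is only a sketch: it assumes the image and kernel live on an unbounded grid (so there are no boundaries) and a stride of 1. The kernel name K and the summation indices m, n are my own notation; g and I' are as defined in the book's passage. In the question's notation, with f as convolution, the last line reads f(g(I)) = g(f(I)).

```latex
\begin{aligned}
(I * K)(x, y) &= \sum_{m}\sum_{n} I(x - m,\, y - n)\, K(m, n), \\
\bigl(g(I) * K\bigr)(x, y) &= \sum_{m}\sum_{n} I'(x - m,\, y - n)\, K(m, n) \\
&= \sum_{m}\sum_{n} I(x - m - 1,\, y - n)\, K(m, n) \\
&= (I * K)(x - 1,\, y) \\
&= g(I * K)(x, y).
\end{aligned}
```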

For instance:

import numpy as np
from scipy import misc, ndimage, signal

def translate(img, dx):
    img_t = np.zeros_like(img)
    if dx == 0:  img_t[:, :]   = img[:, :]
    elif dx > 0: img_t[:, dx:] = img[:, :-dx]
    else:        img_t[:, :dx] = img[:, -dx:]
    return img_t

def convolution(img, k):
    return np.sum([signal.convolve2d(img[:, :, c], k[:, :, c])
        for c in range(img.shape[2])], axis=0)

img = ndimage.imread('house.jpg')

k = np.array([
    [[ 0,  1, -1], [1, -1, 0], [ 0, 0, 0]],
    [[-1,  0, -1], [1, -1, 0], [ 1, 0, 0]],
    [[ 1, -1,  0], [1,  0, 1], [-1, 0, 1]]])

ct = translate(convolution(img, k), 100)
tc = convolution(translate(img, 100), k)

misc.imsave('conv_then_trans.png', ct)
misc.imsave('trans_then_conv.png', tc)

if np.all(ct[2:-2, 2:-2] == tc[2:-2, 2:-2]):
    print('Equal!')

Prints:

Equal!
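
To make the boundary effects concrete, here is a smaller self-contained sketch that needs no image file. It is not the code above: the random single-channel test image, the 2D kernel, the shift of 5, and the name `mismatch_cols` are all assumptions chosen for illustration. It compares the two orderings everywhere instead of cropping first.

```python
import numpy as np
from scipy import signal

def translate(img, dx):
    """Shift columns right by dx (dx > 0), zero-filling the vacated columns."""
    out = np.zeros_like(img)
    out[:, dx:] = img[:, :-dx]
    return out

# Hypothetical test data: a 20x30 single-channel image with values >= 1.
rng = np.random.default_rng(0)
img = rng.integers(1, 256, size=(20, 30)).astype(float)
k = np.array([[0., 1., -1.],
              [1., -1., 0.],
              [0., 0., 1.]])

# convolve2d defaults to mode='full', so each output is 22x32.
ct = translate(signal.convolve2d(img, k), 5)   # convolve, then translate
tc = signal.convolve2d(translate(img, 5), k)   # translate, then convolve

# Columns where the two orderings disagree.
mismatch_cols = np.unique(np.nonzero(ct != tc)[1])
interior_equal = np.array_equal(ct[2:-2, 2:-2], tc[2:-2, 2:-2])

print('mismatching columns:', mismatch_cols)
print('interior equal:', interior_equal)
```

In this setup the two results should differ only in the last two columns of the output, which is where translating the input first pushes pixels off the right edge before the kernel can see them. That is why the comparison above crops a 2-pixel border before testing for equality.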


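The problem is that you're over-translating in the second example. Your convolution uses a stride of 2, so its output is half the size of the input, and a 100-pixel shift of the input corresponds to a 50-pixel shift of the output. After the convolution, translate by 50 instead of 100.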
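
The following sketch checks that claim on synthetic data. It follows the question's approach (a stride-2 loop, and "translation" done by cropping rows from the top), but the helper name `strided_conv`, the random image and kernel, and the shift of 10 are assumptions made so the example runs without `house.jpg`.

```python
import numpy as np

def strided_conv(img, k, stride=2):
    """Strided 'valid' sliding-window sum of an (H, W, C) image with a
    (kh, kw, C) kernel. Like the question's loop, the kernel is not flipped,
    so strictly this is cross-correlation; the shift argument is the same."""
    kh, kw = k.shape[:2]
    rows = (img.shape[0] - kh) // stride + 1
    cols = (img.shape[1] - kw) // stride + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            patch = img[stride * i:stride * i + kh,
                        stride * j:stride * j + kw, :]
            out[i, j] = np.sum(patch * k)
    return out

# Hypothetical test data standing in for the house image and kernel.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 48, 3)).astype(float)
k = rng.integers(-1, 2, size=(3, 3, 3)).astype(float)

shift = 10  # rows cropped from the top of the input
conv = strided_conv(img, k)
crop_then_conv = strided_conv(img[shift:], k)

# Correct: shift the output by shift / stride.
conv_then_crop = conv[shift // 2:]
same = np.array_equal(conv_then_crop, crop_then_conv)

# What the question does: shift the output by the full input shift.
over_shifted = conv[shift:]
n = len(over_shifted)
wrong = np.array_equal(over_shifted, crop_then_conv[:n])

print('shift by shift/stride matches:', same)
print('shift by full amount matches:', wrong)
```

Note that the match is exact only when the input shift is a multiple of the stride. With an odd shift and a stride of 2, the strided windows land on different pixels and no output shift reproduces the result.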

Upvotes: 1
