user1658296
user1658296

Reputation: 1418

Numpy: Broadcasting from submatrix

Given two 2D arrays:

A =[[1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4]]

B =[[1, 2],
    [3, 4]]

A - B = [[ 0, -1,  1,  0],
         [-2, -3, -1, -2],
         [ 2,  1,  3,  2],
         [ 0, -1,  1,  0]]

B's shape is 2,2, A's is 4,4. I want to perform a broadcast subtraction of B over A: A - B.

I specifically want to use broadcasting as the array sizes I am dealing with are very large (8456,8456). I am hoping that broadcasting will provide a small performance optimization.

I've tried reshaping the arrays but with no luck, and am stumped. Scikit is not available to me to use.

Upvotes: 2

Views: 216

Answers (4)

Divakar
Divakar

Reputation: 221684

Approach #1 : Here's an approach using strides that uses the concept of views without making actual copies to then perform subtraction from A and as such should be quite efficient -

m,n = B.strides
m1,n1 = A.shape
m2,n2 = B.shape
s1,s2 = m1//m2, n1//n2
strided = np.lib.stride_tricks.as_strided         
out = A - strided(B,shape=(s1,m2,s2,n2),strides=(0,n2*n,0,n)).reshape(A.shape)

Sample run -

In [78]: A
Out[78]: 
array([[29, 53, 30, 25, 92, 10],
       [ 2, 20, 35, 87,  0,  9],
       [46, 30, 20, 62, 79, 63],
       [44,  9, 78, 33,  6, 40]])

In [79]: B
Out[79]: 
array([[35, 60],
       [21, 86]])

In [80]: m,n = B.strides
    ...: m1,n1 = A.shape
    ...: m2,n2 = B.shape
    ...: s1,s2 = m1//m2, n1//n2
    ...: strided = np.lib.stride_tricks.as_strided
    ...: 

In [81]: # Replicated view
    ...: strided(B,shape=(s1,m2,s2,n2),strides=(0,n2*n,0,n)).reshape(A.shape)
Out[81]: 
array([[35, 60, 35, 60, 35, 60],
       [21, 86, 21, 86, 21, 86],
       [35, 60, 35, 60, 35, 60],
       [21, 86, 21, 86, 21, 86]])

In [82]: A - strided(B,shape=(s1,m2,s2,n2),strides=(0,n2*n,0,n)).reshape(A.shape)
Out[82]: 
array([[ -6,  -7,  -5, -35,  57, -50],
       [-19, -66,  14,   1, -21, -77],
       [ 11, -30, -15,   2,  44,   3],
       [ 23, -77,  57, -53, -15, -46]])

Approach #2 : We can just reshape both A and B to 4D shapes with B having two singleton dimensions along which its elements would be broadcasted when subtracted from 4D version of A. After subtraction, we reshape back to 2D for final output. Thus, we would have an implementation, like so -

m1,n1 = A.shape
m2,n2 = B.shape
out = (A.reshape(m1//m2,m2,n1//n2,n2) - B.reshape(1,m2,1,n2)).reshape(m1,n1)

Upvotes: 2

P. Camilleri
P. Camilleri

Reputation: 13218

If you do not want to tile, you can reshape A to extract (2, 2) blocks, and use broadcasting to substract B:

C = A.reshape(A.shape[0]//2, 2, A.shape[1]//2, 2).swapaxes(1, 2)
C - B
array([[[[ 0, -1],
     [-2, -3]],

    [[ 1,  0],
     [-1, -2]]],


   [[[ 2,  1],
     [ 0, -1]],

    [[ 3,  2],
     [ 1,  0]]]])

And then swap the axis back and reshape:

(C - B).swapaxes(1, 2).reshape(A.shape[0], A.shape[1])

This should be significantly faster, since C is a view on A, not a constructed array.

Upvotes: 0

Nils Werner
Nils Werner

Reputation: 36839

You can expand B by tiling it twice in both dimensions:

print A - numpy.tile(B, (2, 2))

yields

[[ 0 -1  1  0]
 [-2 -3 -1 -2]
 [ 2  1  3  2]
 [ 0 -1  1  0]]

However for big matrices this may create a lot of overhead in RAM.

Alternatively you can view A in blocks using Scikit Image's skimage.util.view_as_blocks and modify it in place

Atmp = skimage.util.view_as_blocks(A, block_shape=(2, 2))
Atmp -= B

print A

which will result, without needlessly repeating B

[[ 0 -1  1  0]
 [-2 -3 -1 -2]
 [ 2  1  3  2]
 [ 0 -1  1  0]]

Upvotes: 2

João Almeida
João Almeida

Reputation: 5087

This should work if A has dimentions that are multiple of B's dimentions:

A - np.tile(B, (int(A.shape[0]/B.shape[0]), int(A.shape[1]/B.shape[1])))

And the result:

array([[ 0, -1,  1,  0],
       [-2, -3, -1, -2],
       [ 2,  1,  3,  2],
       [ 0, -1,  1,  0]])

Upvotes: 1

Related Questions