Dschoni
Dschoni

Reputation: 3862

Count number of occurrences of an array without overlap in another array

I have a mxn matrix A, where m%t = n%t = 0, so that a smaller txt matrix B tiles the matrix without borders or overlaps. I want to check if A consists entirely of tiles from B without calculating a tiling as an intermediate step as efficiently as possible. Furthermore for my special use case, it is not necessary to know B. It is enough to test if A strictly repeats itself every txt tile in every direction.

Numeric examples:

A = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1]]
B.shape = [2,2]
--> True
B.shape = [1,1]
--> False

So far, I calculate a comparison matrix C, which simply is a tiling of B to fit the size of A:

import numpy as np
x,y      = B.shape
x_a, y_a = A.shape
x_t = x_a/x
y_t = y_a/y
B_dash = A[:x, :y]
C = np.tile(B_dash,(x_t, y_t))
np.count_nonzero(A-C)

Is there a faster way, without calculating C?

Upvotes: 5

Views: 166

Answers (2)

soloice
soloice

Reputation: 1040

It's possible to get the result without generating C. You can do it by:

x,y      = B.shape
x_a, y_a = A.shape
np.array_equal(A[:, :y_a-y], A[:, y:]) and np.array_equal(A[:x_a-x, :], A[x:, :])

That is, to compare A's first y_a-y columns and its last y_a-y columns, then do similar things for rows. I haven't test the code above, but it should be faster, because with this method numpy will not allocate new memories.

The last statement can be optimized to:

np.array_equal(A[:, :y_a-y], A[:, y:]) and np.array_equal(A[:x_a-x, :y_a], A[x:, :y_a])

Because if the first term is True, we've already known that A's columns are repeated t times.

Upvotes: 0

Divakar
Divakar

Reputation: 221584

Appproach #1 : It seems we are counting the number of occurrences of B in A as distinct blocks. So, we can use skimage.util.view_as_blocks -

from skimage.util import view_as_blocks as viewW

out = np.count_nonzero((viewW(A, B.shape) == B).all((2,3)))

Appproach #2 : Staying with NumPy, we would have -

m1,n1 = A.shape
m2,n2 = B.shape
out = np.count_nonzero((A.reshape(m1//m2,m2,n1//n2,n2) == B[:,None]).all((1,3)))

Sample runs -

In [274]: A
Out[274]: 
array([[2, 0, 2, 0],
       [5, 3, 5, 1],
       [3, 3, 2, 6],
       [1, 0, 3, 1]])

In [275]: B
Out[275]: 
array([[3, 3],
       [1, 0]])

In [276]: np.count_nonzero((viewW(A, B.shape) == B).all((2,3)))
Out[276]: 1



In [278]: A
Out[278]: 
array([[2, 0, 3, 3],
       [5, 3, 1, 0],
       [3, 3, 2, 6],
       [1, 0, 3, 1]])

In [279]: B
Out[279]: 
array([[3, 3],
       [1, 0]])

In [280]: np.count_nonzero((viewW(A, B.shape) == B).all((2,3)))
Out[280]: 2

Upvotes: 3

Related Questions