Reputation: 41

2D array Similarity

I am given two 2-dimensional arrays each of size N*N .Two 2D sub arrays of these arrays are said to be similar if atleast K% elements of these subarrays are having same value and same position in both arrays . I need to tell the maximum size of such subarray and the number of such subarrays having this maximum size.

Please help to solve this problem

Example:

Suppose i have two 2d arrays and k=50:

1 3 5

6 7 8

5 7 9

1 5 7

6 7 3

3 7 9

Here , we can see we have out of all submatrices 2 submtracies of 2*2 that are :

A

1 3

6 7

B

1 5

6 7
A

7 8

7 9

B

7 3

7 9

Both these submatrices are having three elements in common,and clearly more than 50 % elements are same.And also the position of both submatrices is also same

But the 3*3 matrix itself has more than 50 % elements same. So the answer will be max size is 3 and number of submatrices is 1

Upvotes: 0

Answers (2)

Jherico

Reputation: 29240

My first pass would be to generate a boolean matrix of the same size as your two inputs, with true meaning the value match at that position, and false meaning they don't. Then I'd find the the smallest square submatrix where the number of trues over the total values exceeds K, starting with the total matrix and then working my way down. Is that what you're doing now?

EDIT:

It seems like the search for the solution in the boolean matrix is an application of density-based clustering. Perhaps an adaptation of the DBSCAN algorithm is possible. It claims to be either O(n LOG n) or O(N^2) depending on whether an indexing structure is used.

Upvotes: 0

Abhishek Bansal

Reputation: 12705

Using these two matrices(A, B), form a third matrix (C)that has an element:

C(i,j) = { 1 if A(i,j) == B(i,j), else zero}

Now the problem is to find the largest submatrix in C that has atleast k% 1s. So if the submatrix is of size mxm, then:

Sum of all elements in this submatrix >= (k/100) *m*m.

This can be done in O(N^3) time as follows:

For each element, calculate the cummulative sum of all elements before it.

D(i,j) = summation ( C(l,h) for l = 0 to i and h = 0 to j )

Now the possible sizes of submatrices can be (nxn) down to (1x1). So for each size (s) matrix from n to 1, loop through the submatrices.

This can be done with 1 pointer p that scans each element and forms the top left corner of the sub-matrix. The top right corner q of submatrix shall be in the same row as p but with a column at p + s.

You can check the sum of elements in submatrix formed by p and q as topleft and topright corners respectively in O(1) time since you know the cumulative sum upto the elements.

Sum of elements in sub-matrix(p->q) for row r 
  = D( r + q - p, q ) - D( r-1, q ) - D( r + q - p, p - 1 ) + D ( r - 1, p - 1 )

Stop when you find a valid matrix or multiple valid matrices of same size.

Upvotes: 1

2D array Similarity

Answers (2)

Related Questions