Matrix Multiplication using divide and conquer in Python

I wrote a program in python3 to find out a product of 2 n*n matrices, (where n is a power of 2).

Why the code below does not work out and shows IndexError: invalid index to scalar variable?

import numpy as np

def product(x, y, k):
    def fsum(p, q, m):
        r = [[p[i, j] + q[i, j] for j in range(m)] for i in range(m)]
        return r

    if k == 1:
        return x[0][0] * y[0][0]
    else:
        A = x[0:(k // 2), 0:(k // 2)]
        B = x[0:(k // 2), (k // 2):k]
        C = x[(k // 2):k, 0:(k // 2)]
        D = x[(k // 2):k, (k // 2):k]

        E = y[0:(k // 2), 0:(k // 2)]
        F = y[0:(k // 2), (k // 2):k]
        G = y[(k // 2):k, 0:(k // 2)]
        H = y[(k // 2):k, (k // 2):k]

        C00 = fsum(product(A, E, k // 2), product(B, G, k // 2), k // 2)
        C01 = fsum(product(A, F, k // 2), product(B, H, k // 2), k // 2)
        C10 = fsum(product(C, E, k // 2), product(D, G, k // 2), k // 2)
        C11 = fsum(product(C, F, k // 2), product(D, H, k // 2), k // 2)

        return np.array([[C00, C01], [C10, C11]])

n = int(input('Enter index(power of 2): '))
print('Input 1st matrix')
a = np.array([[int(_) for _ in input().split()] for x in range(n)])
print('Input 2nd matrix')
b = np.array([[int(_) for _ in input().split()] for x in range(n)])
print(product(a, b, n))

Upvotes: 0

Views: 699

Answers (2)

hpaulj
hpaulj

Reputation: 231510

A sample run:

Enter index(power of 2): 2
Input 1st matrix
1 2
3 4
[[1 2]
 [3 4]]
Input 2nd matrix
3 4
5 6
[[3 4]
 [5 6]]
Traceback (most recent call last):
  File "stack51061196.py", line 35, in <module>
    print(product(a, b, n))
  File "stack51061196.py", line 21, in product
    C00 = fsum(product(A, E, k // 2), product(B, G, k // 2), k // 2)
  File "stack51061196.py", line 5, in fsum
    r = [[p[i, j] + q[i, j] for j in range(m)] for i in range(m)]
  File "stack51061196.py", line 5, in <listcomp>
    r = [[p[i, j] + q[i, j] for j in range(m)] for i in range(m)]
  File "stack51061196.py", line 5, in <listcomp>
    r = [[p[i, j] + q[i, j] for j in range(m)] for i in range(m)]
IndexError: invalid index to scalar variable.

Adding a print('p,q', p, q, type(p), type(q)) to the fsum I see

p,q 1 6 <class 'numpy.int64'> <class 'numpy.int64'>

So p is a np.int64 object, not an array. It's already been indexed, and can't go any further.

In [193]: x = np.array([1])[0]
In [194]: x
Out[194]: 1
In [195]: type(x)
Out[195]: numpy.int64
In [196]: x[0,0]
IndexError: invalid index to scalar variable.

Another diagnostic print

print('product AE',k,type(product(A, E, k // 2)))

displays

product AE 2 <class 'numpy.int64'>

So when k is 2, product returns a scalar variable, an int64 object, not an array. Passing that to fsum produces the error.

It's this branch of the product that causes the problem - x[0][0] (why not x[0,0]?) is an element of x:

if k == 1:
    return x[0][0] * y[0][0]

At this point x is (1,1) shape, so you could just write

if k == 1:
     return x * y

With this change I get:

0840:~/mypy$ python3 stack51061196.py 
a,b [[1 2]
 [3 4]] [[1 2]
 [3 4]]
A [[1]] <class 'numpy.ndarray'>
product AE 2 <class 'numpy.ndarray'>
p,q [[1]] [[6]] <class 'numpy.ndarray'> <class 'numpy.ndarray'>
p,q [[2]] [[8]] <class 'numpy.ndarray'> <class 'numpy.ndarray'>
p,q [[3]] [[12]] <class 'numpy.ndarray'> <class 'numpy.ndarray'>
p,q [[6]] [[16]] <class 'numpy.ndarray'> <class 'numpy.ndarray'>
[[[[ 7]]

  [[10]]]


 [[[15]]

  [[22]]]]

which apart from dimensions matches:

In [197]: a=np.array([[1,2],[3,4]])
In [198]: np.dot(a,a)
Out[198]: 
array([[ 7, 10],
       [15, 22]])

Your shape is (2, 2, 1, 1), which can be removed with squeeze, but you really should refine the iteration to get the shape right without it.

It also gets the right numbers for a 4x4 array, but the shape is now (2, 2, 2, 2, 1, 1).

Upvotes: 0

Nils Werner
Nils Werner

Reputation: 36775

Here is one way of doing it:

def product(X, Y):
    k = len(X)
    if k == 1:
        return X * Y
    (A, B), (C, D) = skimage.util.view_as_blocks(X, block_shape=(k // 2, k // 2))
    (E, F), (G, H) = skimage.util.view_as_blocks(Y, block_shape=(k // 2, k // 2))

    out = np.zeros((k, k))
    (I, J), (K, L) = skimage.util.view_as_blocks(out, block_shape=(k // 2, k // 2))
    I[:] = product(A, E) + product(B, G)
    J[:] = product(A, F) + product(B, H)
    K[:] = product(C, E) + product(D, G)
    L[:] = product(C, F) + product(D, H)
    return out

Needless to say it is terribly slow

A = numpy.random.rand(32, 32)

np.allclose(product(A, A), A @ A)
# True
%timeit product(A, A)
# 526 ms ± 11.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit A @ A
# 4.36 µs ± 59.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Upvotes: 2

Related Questions