Reputation: 29

When does python start using a different algorithm for big multiplication?

I'm currently in an algorithms class and was interested to see which of two methods of multiplying a list of large numbers gives the faster runtime. What I found was that the recursive multiply performs about 10x faster. For the code below, I got t_sim=53.05s and t_rec=4.73s. I did some other tests and they all seemed to be around the 10x range.

Additionally, you could put the values from the recursive multiply into a tree and reuse them to even more quickly compute multiplications of subsets of the list.

I did a theoretical runtime analysis, and both are n^2 using standard multiplication, but when the karatsuba algorithm is used, that factor goes down to n^log_2(3).

Every multiply in simple_multiply should have runtime i * 1. Summing over i=1...n, we get an arithmetic series and can use gauss's formula to get n*(n+1)/2 = O(n^2).

For the second one, we can see that the time to multiply for a given level of recursion is (2^d)^2, where d is the depth, but only needs to multiply n*2^-d values. The levels turn out to form a geometric series where the runtime at each level is n*2^d with a final depth of log_2(n). The solution to the geometric series is n * (1-2^log_2(n))/(1-2) = n*(n-1) = O(n^2). If using the karatsuba algorithm, you can get O(n^log_2(3)) by doing the same method

If the code were using the karatsuba algorithm, then the speedup would make sense, but what doesn't seem to make sense is the linear relationship between the two runtimes, making it seem like python is using standard multiplication, which according to wikipedia is faster when using under 500ish bits. (I'm using 2^23 bits in the code below. Each number is literally a megabyte long)

import random
import time

def simple_multiply(values):
    a = 1
    for val in values:
        a *= val
    return a

def recursive_multiply(values):
    if len(values) == 1:
        return values[0]
    temp = []
    i = 0
    while i + 1 < len(values):
        temp.append(values[i] * values[i+1])
        i += 2
    if len(values) % 2 == 1:
        temp.append(values[-1])
    return recursive_multiply(temp)

def test(func, values):
    t1 = time.time()
    func(values)
    print( time.time() - t1)

def main():
    n = 2**11
    scale = 2**12
    values = [random.getrandbits(scale) for i in range(n)]
    test(simple_multiply, values)
    test(recursive_multiply, values)
    pass

if __name__ == '__main__':
    main()

Upvotes: 2

Answers (2)

Rocky Li

Reputation: 5958

There is something wrong in your assumption. Your assumption is that each multiplication of the between the different ranks should take same times, for instance len(24)*len(72) approx len(48)*len(48). But that's not true, as evident by the following snippets:

%%timeit
random.getrandbits(2**14)*random.getrandbits(2**14)*random.getrandbits(2**14)*random.getrandbits(2**14)

>>>1000 loops, best of 3: 1.48 ms per loop

%%timeit
(random.getrandbits(2**14)*random.getrandbits(2**14))*(random.getrandbits(2**14)*random.getrandbits(2**14))

>>>1000 loops, best of 3: 1.23 ms per loop

The difference is consistent even on such a small scale

Upvotes: 0

Matt Timmermans

Reputation: 59263

Both versions of the code have the same number of multiplications, but in the simple version each multiplication is ~2000 bits long on average.

In the second version n/2 multiplications are 24 bits long, n/4 are 48 bits long, n/8 are 96 bits long, etc... The average length is only 48 bits.

Upvotes: 2

When does python start using a different algorithm for big multiplication?

Answers (2)

Related Questions