Ruan Putka
Ruan Putka

Reputation: 27

How can I solve a system of linear equations in python faster than with numpy.linalg.lstsq?

I am trying to solve a linear system spanning somewhat between hundred thousand and two hundred thousand equations with numpy.linalg.lstsq but is taking waaaaay too long. What can I do to speed this up?

The matrix is sparse with hundreds of columns (the dimensions are approximately 150 000 x 140) and the system is overdetermined.

Upvotes: 0

Views: 3192

Answers (2)

Nbarjest
Nbarjest

Reputation: 761

If your coefficient matrix is sparse, use "spsolve" from "scipy.sparse.linalg".

Upvotes: 0

Paul Panzer
Paul Panzer

Reputation: 53089

Here is a bit of improvised trickery that speeds up the computation quite a bit on random data of the indicated dimensions.

I've no idea how numerically sound this is, though.

import numpy as np
from time import perf_counter

def lstsq(A, b):
    AA = A.T @ A
    bA = b @ A
    D, U = np.linalg.eigh(AA)
    Ap = (U * np.sqrt(D)).T
    bp = bA @ U / np.sqrt(D)
    return np.linalg.lstsq(Ap, bp, rcond=None)

# create random data
A = np.random.random((150_000, 140))
b = np.random.random((150_000,))

# use solver directly
t = perf_counter()
x, *info = np.linalg.lstsq(A, b, rcond=None)
s = perf_counter()
print('direct method:     ', s-t, 'seconds')
# use equivalent reduced system
t = perf_counter()
x_acc, *info_acc = lstsq(A, b)
s = perf_counter()
print('accelerated method:', s-t, 'seconds')
print('results equal:', np.allclose(x, x_acc))

Sample run:

direct method:      3.032766239999546 seconds
accelerated method: 0.20947745100056636 seconds
results equal: True

Upvotes: 1

Related Questions