Hashemi Emad

Reputation: 103

time consuming matrix operation

I am running an iterative, reweighted least squares regression. There are two major parts: finding the weights and fitting the regression.

The fit uses statsmodels.regression.linear_model.WLS.fit (2 matrix multiplications, 1 matrix inversion and 3 other matrix multiplications) and takes around 3 ms.

Finding the weights consists of subtracting two arrays, dividing each element by a scalar, squaring each element, negating, adding 1, and taking the maximum of each element and 0 (an Epanechnikov kernel on the standardised errors of the fit):

        err = y - y_hat
        h = np.std(err) * c
        w = np.maximum(0, 1 - (err / h)**2)

but it takes 30 ms. I don't understand why a matrix inversion would take 10 times less time than this. We are talking about 3000x3000 matrices and 3000x1 arrays (y, y_hat, err and w); c is a scalar that depends on the size (a function of 3000 in this example). The most expensive part is the third line (>80% of the calculation time).
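For comparison, here is a minimal sketch (with made-up data standing in for y and y_hat, and an arbitrary c) of the same weight computation rewritten with an in-place square, which avoids the generic float-power routine and one temporary array; it produces the same weights as the original formulation:

```python
import numpy as np

# Stand-in data: the real y, y_hat and c come from the regression loop.
rng = np.random.default_rng(0)
y = rng.normal(size=3000)
y_hat = y + rng.normal(scale=0.1, size=3000)
c = 1.0

err = y - y_hat
h = np.std(err) * c

# In-place square instead of (err / h)**2.
u = err / h
u *= u
w = np.maximum(0.0, 1.0 - u)

# Reference: the original formulation from the question.
w_ref = np.maximum(0, 1 - ((y - y_hat) / h)**2)
assert np.allclose(w, w_ref)
```

Whether this helps measurably depends on the NumPy version; recent releases already special-case small integer powers.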

Now this doesn't seem like a lot, but I have to do a whole lot of these.

How can I accelerate this process?

Upvotes: 0

Views: 207

Answers (1)

asylumax

Reputation: 821

A small point, but you have **2; what about (err/h)*(err/h)?

I thought the use of ** was more costly. In Spyder, if I define two functions:

def function_a(i):
    a = i ** 2
    return a

def function_b(i):
    b = i * i
    return b

Running these from the console, I get the results below; multiplication is about 3x faster.

%timeit for x in range(100): function_a(x)
30.2 µs ± 1.12 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit for x in range(100): function_b(x)
11.5 µs ± 185 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
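The same substitution can be tried on NumPy arrays, which is the questioner's case: x * x (or np.square(x)) computes elementwise squares, and a quick sanity check confirms it matches x ** 2 numerically:

```python
import numpy as np

# Elementwise square three ways; all should agree numerically.
x = np.linspace(-3.0, 3.0, 7)
assert np.allclose(x * x, x ** 2)
assert np.allclose(np.square(x), x ** 2)
```

Whether the multiplication form is actually faster on arrays depends on the NumPy version, since newer releases already route small integer powers to a multiply; timing it on the real 3000-element arrays is the only reliable check.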

Upvotes: 1
