Reputation: 97
I have a 2D matrix and I need to sum a subset of the matrix elements, given two lists of indices, imp_list and bath_list. Here is what I'm doing right now:
s = 0.0
for i in imp_list:
    for j in bath_list:
        s += K[i,j]
which appears to be very slow. What would be a better solution to perform the sum?
Upvotes: 2
Views: 2466
Reputation: 177078
If you're working with large arrays, you should get a huge speed boost by using NumPy's own indexing routines instead of Python's for loops.
In the general case you can use np.ix_ to select a subarray of the matrix to sum:
K[np.ix_(imp_list, bath_list)].sum()
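As a small sanity check of what np.ix_ selects, here is a minimal sketch on a made-up 4x5 matrix. The names K_small, rows and cols are illustrative only and do not come from the question; the result is compared against the explicit double loop.

```python
import numpy as np

# Hypothetical small example; K_small, rows and cols are illustrative names.
K_small = np.arange(20).reshape(4, 5)
rows = [0, 2]
cols = [1, 3, 4]

# np.ix_ builds an open mesh, so indexing with it selects every (i, j)
# pair with i in rows and j in cols, giving a 2x3 subarray.
sub = K_small[np.ix_(rows, cols)]
ix_sum = sub.sum()

# Reference result from the explicit double loop.
loop_sum = sum(K_small[i, j] for i in rows for j in cols)

print(sub.shape, ix_sum == loop_sum)
```

Note that plain K_small[rows, cols] is not equivalent: NumPy would try to pair the two lists element by element instead of taking all combinations.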
Note that np.ix_ carries some overhead (it triggers advanced indexing, which copies the selected elements), so if your two lists contain consecutive or evenly-spaced values, it's worth using regular slicing to index the array instead (see method3() below).
Here's some data to illustrate the improvements:
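import numpy as np

K = np.arange(1000000).reshape(1000, 1000)
imp_list = range(100)   # 0, 1, 2, ..., 99
bath_list = range(200)  # 0, 1, 2, ..., 199

def method1():
    # Explicit Python double loop over every (i, j) pair
    s = 0
    for i in imp_list:
        for j in bath_list:
            s += K[i,j]
    return s

def method2():
    # Advanced indexing with an open mesh built by np.ix_
    return K[np.ix_(imp_list, bath_list)].sum()

def method3():
    # Basic slicing; only valid because the indices are consecutive
    return K[:100, :200].sum()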
Then:
In [80]: method1() == method2() == method3()
Out[80]: True
In [91]: %timeit method1()
10 loops, best of 3: 9.93 ms per loop
In [92]: %timeit method2()
1000 loops, best of 3: 884 µs per loop
In [93]: %timeit method3()
10000 loops, best of 3: 34 µs per loop
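If you don't know in advance whether the index lists are evenly spaced, one option is to check at runtime and fall back to np.ix_ otherwise. The following is only a sketch: as_slice and subset_sum are hypothetical helper names, not NumPy functions, and it assumes non-negative, in-bounds indices.

```python
import numpy as np

def as_slice(indices):
    """Return an equivalent slice if `indices` is non-negative, increasing
    and evenly spaced; otherwise return None.

    Hypothetical helper, not part of NumPy.
    """
    idx = np.asarray(indices)
    if idx.ndim != 1 or idx.size == 0 or idx[0] < 0:
        return None
    if idx.size == 1:
        return slice(int(idx[0]), int(idx[0]) + 1)
    step = int(idx[1] - idx[0])
    if step <= 0 or np.any(np.diff(idx) != step):
        return None
    return slice(int(idx[0]), int(idx[-1]) + 1, step)

def subset_sum(K, imp_list, bath_list):
    """Sum K[i, j] over all i in imp_list and j in bath_list."""
    r, c = as_slice(imp_list), as_slice(bath_list)
    if r is not None and c is not None:
        # Basic slicing returns a view, so no subarray is copied.
        return K[r, c].sum()
    # General case: advanced indexing over the open mesh.
    return K[np.ix_(imp_list, bath_list)].sum()
```

One caveat with this design: slicing silently clips out-of-range indices, whereas np.ix_ indexing raises an IndexError, so the two paths only agree when every index is valid for K.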
Upvotes: 4