Reputation: 97
I need to speed up the function below:
import numpy as np
import itertools
import timeit

def combcol(myarr):
    ndims = myarr.shape[0]
    solutions = []
    for idx1, idx2, idx3, idx4, idx5, idx6 in itertools.combinations(np.arange(ndims), 6):
        c1, c2, c3, c4, c5, c6 = myarr[idx1,1], myarr[idx2,2], myarr[idx3,1], myarr[idx4,2], myarr[idx5,1], myarr[idx6,2]
        if c1-c2>0 and c2-c3<0 and c3-c4>0 and c4-c5<0 and c5-c6>0:
            solutions.append(((idx1, idx2, idx3, idx4, idx5, idx6),(c1, c2, c3, c4, c5, c6)))
    return solutions
X = np.random.random((20, 10))
Y = np.random.random((40, 10))

if __name__=='__main__':
    from timeit import Timer
    t = Timer(lambda : combcol(X))
    t1 = Timer(lambda : combcol(Y))
    print('t : ',t.timeit(number=1),'t1 : ',t1.timeit(number=1))
Results:
t : 0.6165180211451455 t1 : 64.49216925614847
The algorithm is too slow for my standard use (myarr.shape[0] = 500).
Is there a NumPy way to decrease execution time of this function (without wasting too much memory)?
Is it possible to implement the problem in Cython?
I've profiled the code with the profile module to see which parts are slow. Almost all of the time is spent inside combcol().
import profile
........
........
profile.run('print(len(combcol(Y))); print')
144547
144559 function calls in 39.672 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
144547 0.641 0.000 0.641 0.000 :0(append)
1 0.000 0.000 0.000 0.000 :0(arange)
2 0.000 0.000 0.000 0.000 :0(charmap_encode)
1 0.000 0.000 39.672 39.672 :0(exec)
1 0.000 0.000 0.000 0.000 :0(len)
1 0.000 0.000 0.000 0.000 :0(print)
1 0.000 0.000 0.000 0.000 :0(setprofile)
1 0.094 0.094 39.672 39.672 <string>:1(<module>)
2 0.000 0.000 0.000 0.000 cp850.py:18(encode)
1 38.938 38.938 39.578 39.578 essaiNumpy4.py:13(combcol)
1 0.000 0.000 39.672 39.672 profile:0(print(len(combcol(Y))); print)
0 0.000 0.000 profile:0(profiler)
Finally, I modified the code like this:
def combcol2(myarr):
    ndims = myarr.shape[0]
    myarr1 = myarr[:,1].tolist()
    myarr2 = myarr[:,2].tolist()
    solutions = []
    for idx1, idx2, idx3, idx4, idx5, idx6 in itertools.combinations(range(ndims), 6):
        if myarr1[idx1] > myarr2[idx2] < myarr1[idx3] > myarr2[idx4] < myarr1[idx5] > myarr2[idx6]:
            solutions.append(((idx1, idx2, idx3, idx4, idx5, idx6),(myarr1[idx1], myarr2[idx2], myarr1[idx3], myarr2[idx4], myarr1[idx5], myarr2[idx6])))
    return solutions
X = np.random.random((40, 10))

if __name__=='__main__':
    from timeit import Timer
    t = Timer(lambda : combcol2(X))
    print('t : ',t.timeit(number=1))

Result:
t : 4.341582240200919
Upvotes: 3
Views: 3143
Reputation: 10139
You could iterate over the combinations recursively instead of using itertools.combinations(). The benefit is that you could test, e.g., whether c1 > c2, and if that is False, skip all related combinations at once.
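This pruning idea can be sketched as a recursive enumerator (a sketch, not the answerer's code; combcol_rec and its internal names are illustrative). Indices are generated in increasing order, exactly like itertools.combinations, but a partial tuple is abandoned as soon as it violates the alternating pattern, so whole subtrees of combinations are never visited:

```python
import numpy as np

def combcol_rec(myarr):
    """Recursively build increasing 6-tuples of row indices, pruning every
    prefix that already violates the alternating c1>c2<c3>c4<c5>c6 pattern."""
    n = myarr.shape[0]
    cols = (1, 2, 1, 2, 1, 2)      # column read at each of the six positions
    solutions = []

    def rec(start, depth, idxs, vals):
        if depth == 6:
            solutions.append((tuple(idxs), tuple(vals)))
            return
        # Leave room for the 5 - depth indices still to be chosen after i.
        for i in range(start, n - (5 - depth)):
            v = myarr[i, cols[depth]]
            if depth > 0:
                # Odd positions must lie below their predecessor, even ones above.
                ok = vals[-1] > v if depth % 2 == 1 else vals[-1] < v
                if not ok:
                    continue       # prunes every completion of this prefix
            idxs.append(i)
            vals.append(v)
            rec(i + 1, depth + 1, idxs, vals)
            idxs.pop()
            vals.pop()

    rec(0, 0, [], [])
    return solutions
```

The output matches combcol exactly (same tuples, same order); the win grows with how early the typical prefix fails.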
Upvotes: 2
Reputation: 8711
alko has listed a useful overhaul for your program, and Tim Peters has pointed out that 500-choose-6 is over 21 trillion (i.e., 21,057,686,727,000). This answer will point out a simple speedup to your original program. (I think it's a small speedup compared to what alko's method might offer, but the following is worth noting for future Python programming.)
Your selection statement is
if c1-c2>0 and c2-c3<0 and c3-c4>0 and c4-c5<0 and c5-c6>0 :
which is equivalent to
if c1>c2 and c2<c3 and c3>c4 and c4<c5 and c5>c6 :
but in the Python interpreter the former takes 67% longer than the latter. For example, here are some sample outputs for the two cases, on my Intel i3-2120 machine (apparently several times faster than your machine) running Python 2.7.5+:
('t : ', 0.12977099418640137, 't1 : ', 14.45378589630127)
('t : ', 0.0887291431427002, 't1 : ', 8.54729700088501)
The ratio of averages over 4 such runs was 14.529/8.709 = 1.668.
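The gap can be reproduced with a standalone timeit micro-benchmark (a sketch; the constants and the exact ratio are illustrative and machine-dependent). The values are chosen so that every clause is True, i.e. no short-circuiting, which is the worst case for both forms:

```python
import timeit

setup = "c1, c2, c3, c4, c5, c6 = 0.9, 0.1, 0.8, 0.2, 0.7, 0.3"

# Original test: five subtractions followed by five comparisons against 0.
t_sub = timeit.timeit(
    "c1-c2>0 and c2-c3<0 and c3-c4>0 and c4-c5<0 and c5-c6>0",
    setup=setup, number=1000000)

# Equivalent direct comparisons: the five subtractions disappear.
t_cmp = timeit.timeit(
    "c1>c2 and c2<c3 and c3>c4 and c4<c5 and c5>c6",
    setup=setup, number=1000000)

print('subtraction form:', t_sub, 'direct form:', t_cmp)
```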
Upvotes: 2
Reputation: 48317
I don't know exactly what you're trying to get, but from your code, the constraints on a solution idx1 ... idx6 are of the form
X[idx1, 1] > X[idx2, 2]
X[idx3, 1] > X[idx2, 2]
X[idx3, 1] > X[idx4, 2]
X[idx5, 1] > X[idx4, 2]
X[idx5, 1] > X[idx6, 2]
You can get (almost instantly for shape <= 500) all admissible pairs (idx1, idx2), (idx3, idx2), (idx3, idx4), (idx5, idx4), (idx5, idx6)
of this form with
idx56 = lst = list(zip(*np.where(X[:,1].reshape(-1,1) > X[:,2].reshape(1,-1))))
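What that one-liner does can be spelled out on a toy array (a sketch; X and its size are illustrative): reshaping column 1 into a column vector and column 2 into a row vector makes the > broadcast into a full n-by-n boolean matrix, and np.where lists its True positions in row-major order:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((8, 3))

# M[a, b] is True exactly when X[a, 1] > X[b, 2]: an (8, 1) column is
# broadcast against a (1, 8) row, giving the full (8, 8) comparison matrix.
M = X[:, 1].reshape(-1, 1) > X[:, 2].reshape(1, -1)

# np.where walks M in row-major order, so pairs come out sorted by a, then b.
pairs = list(zip(*np.where(M)))
print(len(pairs), "admissible pairs out of", X.shape[0] ** 2)
```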
Then you can generate all possible indices with:
import operator
slst = sorted(lst, key=operator.itemgetter(1))
grps = [[x, list(g)] for x, g in itertools.groupby(slst, key=operator.itemgetter(1))]
idx345 = idx123 = [[x1, x2, x3] for _, g in grps for ((x1, x2), (x3, _)) in itertools.combinations(g, 2)]
It took some seconds to compute those lists on my test machine (idx345 was more than 20 billion items long). You just have to join those lists on x3=x3, x5=x5 (I bet the solution size will grow significantly with this join, so I can't guess its usefulness).
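The join left as an exercise might look like this on a toy array (a sketch under this answer's relaxed constraints, which enforce only the five value inequalities plus x1 < x3 < x5 arising from the 2-combinations, not the full idx1 < ... < idx6 ordering of the question; all names are illustrative):

```python
import itertools
import operator
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((6, 3))

# Admissible pairs (a, b) with X[a, 1] > X[b, 2], via the broadcast trick.
pairs = list(zip(*np.where(X[:, 1].reshape(-1, 1) > X[:, 2].reshape(1, -1))))

# Group pairs on their second index m and take 2-combinations inside each
# group: every ((a, m), (b, m)) gives a triple (a, m, b) with a < b in which
# both outer indices "beat" m.  The same list plays the role of
# (idx1, idx2, idx3) and of (idx3, idx4, idx5).
slst = sorted(pairs, key=operator.itemgetter(1))
triples = [(a, m, b)
           for m, g in itertools.groupby(slst, key=operator.itemgetter(1))
           for (a, _), (b, _) in itertools.combinations(list(g), 2)]

# Index the pieces so the joins on x3 = x3 and x5 = x5 are dict lookups.
by_first = {}
for t in triples:
    by_first.setdefault(t[0], []).append(t)
succ = {}
for a, b in pairs:
    succ.setdefault(a, []).append(b)

solutions = [(x1, x2, x3, x4, x5, x6)
             for x1, x2, x3 in triples
             for _, x4, x5 in by_first.get(x3, [])
             for x6 in succ.get(x5, [])]
print(len(solutions), "6-tuples under the relaxed constraints")
```

As the answer warns, the joined result can be far larger than either input list, so this only pays off when the pair matrix is sparse.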
Upvotes: 2