Reputation: 5841
I have a project that makes heavy use of the NumPy function bincount. I would now like to use PyPy to improve performance. Unfortunately, as reported on the numpypy status page, there is still no support for the bincount function.
So my questions are:
Is there an alternative function as fast as bincount that can be implemented with numpypy? I looked at histogram, but it is far slower, and I think using it would defeat the advantage of using PyPy. Here's the proof:
Numpy
timeit.timeit("np.bincount(x)", setup="import numpy as np; x = np.array([0] * 20 + [1] * 30)")
0.8197031021118164
Numpypy
timeit.timeit("np.histogram(x)", setup="import numpy as np; x = np.array([0] * 20 + [1] * 30)")
12.335555076599121
I'm happy to see that numpypy development is very active. Since my project deadline is within a month, is there any chance that bincount will be implemented by then?
Upvotes: 3
Views: 933
Reputation: 25833
You could implement bincount by doing something like:
import numpy as np

def bincount(x):
    # One counter per value from 0 up to the largest value in x.
    result = np.zeros(x.max() + 1, int)
    for i in x:
        result[i] += 1
    return result
You'd have to profile it to know for sure, but because of PyPy's JIT compiler this should actually be very fast, even if it's not as fast as a pure C implementation. If you try it, I'd like to know how it goes.
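For comparison while profiling, here is a minimal sketch of the same counting loop using only built-in lists, so it does not depend on any array support at all. The name bincount_list is hypothetical, and the handling of empty and negative input is an assumption modelled on how np.bincount behaves. It is not a drop-in replacement, since it returns a plain list.

```python
def bincount_list(values):
    """Count occurrences of each non-negative integer in `values`.

    Hypothetical helper: returns a plain list where index i holds the
    number of times i appears in the input.
    """
    values = list(values)
    if not values:
        # Assumption: empty input gives an empty result.
        return []
    if min(values) < 0:
        # Assumption: reject negative values rather than wrap around.
        raise ValueError("bincount_list expects non-negative integers")
    # One counter per value from 0 up to the largest value seen.
    counts = [0] * (max(values) + 1)
    for v in values:
        counts[v] += 1
    return counts


if __name__ == "__main__":
    x = [0] * 20 + [1] * 30
    print(bincount_list(x))  # [20, 30]
```

It can be timed the same way as the calls in the question, by passing a callable such as lambda: bincount_list(x) to timeit.timeit under both interpreters.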
Upvotes: 1