blueSurfer

Reputation: 5841

bincount in Numpypy

I have a project that makes heavy use of the Numpy function bincount. Now I would like to use pypy to improve performance. Unfortunately, as reported on the numpypy status page, there is still no support for the bincount function.

So my questions are:

  1. Is there an alternative function as fast as bincount that can be implemented with numpypy? I looked at histogram, but it is far slower, and I think using it would defeat the purpose of switching to pypy. Here are the timings:

    Numpy

    timeit.timeit("np.bincount(x)", setup="import numpy as np; x = np.array([0] * 20 + [1] * 30)")
    0.8197031021118164
    

    Numpypy

    timeit.timeit("np.histogram(x)", setup="import numpy as np; x = np.array([0] * 20 + [1] * 30)")
    12.335555076599121
    
  2. I'm happy to see that numpypy development is very active. Since my project deadline is within a month, is there any chance that bincount will be implemented by then?

Upvotes: 3

Views: 933

Answers (1)

Bi Rico

Reputation: 25833

You could implement bincount by doing something like:

import numpy as np

def bincount(x):
    # One counter for every value from 0 up to x.max()
    result = np.zeros(x.max() + 1, int)
    for i in x:
        result[i] += 1
    return result

You'd have to profile it to know for sure, but because of PyPy's JIT compiler this should actually be very fast, even if it's not as fast as a pure C implementation. If you try it, I'd like to know how it goes.
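To get a first number before committing to it, something like the following could be used to time the loop. This is only a sketch: `bincount_loop` and `time_bincount` are hypothetical names, and it counts into a plain Python list instead of `np.zeros` so that it runs with or without numpypy. Swap the array back in to measure the version above.

```python
import timeit


def bincount_loop(x):
    """Count occurrences of each non-negative integer in x.

    Hypothetical helper: same loop as above, but the counters live in a
    plain list (an assumption made so the sketch has no dependencies).
    """
    if len(x) == 0:
        return []
    counts = [0] * (max(x) + 1)  # one slot per value from 0 to max(x)
    for i in x:
        counts[i] += 1
    return counts


def time_bincount(x, number=1000):
    """Return the total seconds taken by `number` calls of bincount_loop(x)."""
    return timeit.timeit(lambda: bincount_loop(x), number=number)


if __name__ == "__main__":
    x = [0] * 20 + [1] * 30  # same data as in the question
    print(bincount_loop(x))  # [20, 30]
    print(time_bincount(x))  # seconds; depends on the machine and interpreter
```

Running the same script under both CPython and pypy, ideally on an input closer to the real data size than these 50 elements, should show whether the JIT makes the loop fast enough.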

Upvotes: 1
