Reputation: 6217
Suppose I have an array of times. I know a priori that the maximum time is 1, say, so the array may look like
events = [0.1, 0.2, 0.7, 0.93, 1.37]
The numbers in that array represent the times at which events occurred in the interval [0,1]
(and I am ignoring anything larger than 1).
I don't know a priori the size of the array, but I do have reasonable upper bounds on its size (if that matters), so I can safely truncate it if needed.
I need to convert that array into an array which counts the number of events up to time x, where x ranges over a set of evenly spaced points in the time interval (a linspace). So, for example, if the granularity (= size) of that output array is 7, the result of my function should look like:
def count_events(events, granularity):
...
>>> count_events([0.1, 0.2, 0.7, 0.93, 1.37], 7)
array([0, 1, 2, 2, 2, 3, 4])
# since it checks at times 0, 1/6, 1/3, 1/2, 2/3, 5/6, 1.
I am looking for an efficient solution. Writing a loop is probably very easy here, but my event arrays may be huge. In fact, they are not 1D but 2D, and this counting operation should be applied per axis (like many other numpy functions). To be more precise, here is a 2D example:
def count_events(events, granularity, axis=None):
...
>>> events = np.array([[0.1, 0.2, 0.7, 0.93, 1.37], [0.01, 0.01, 0.9, 2.5, 3.3]])
>>> count_events(events, 7, axis=1)
array([[0, 1, 2, 2, 2, 3, 4],
[0, 2, 2, 2, 2, 2, 3]])
Upvotes: 2
Views: 304
Reputation: 221514
You can simply use np.searchsorted:
np.searchsorted(events, d) # with events being a sorted 1D array
where d is the linspaced array, created like so:
d = np.linspace(0, 1, 7) # 7 being the granularity (number of sample points)
Sample run for the 2D case:
In [548]: events
Out[548]:
array([[ 0.1 , 0.2 , 0.7 , 0.93, 1.37],
[ 0.01, 0.01, 0.9 , 2.5 , 3.3 ]])
In [549]: np.searchsorted(events[0], d) # Use per row
Out[549]: array([0, 1, 2, 2, 2, 3, 4])
In [550]: np.searchsorted(events[1], d)
Out[550]: array([0, 2, 2, 2, 2, 2, 3])
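For reference, these per-row calls can be wrapped into the count_events signature from the question. This is only a sketch: it assumes each row of events is already sorted in ascending order, and the 2D case loops over rows in plain Python.

```python
import numpy as np

def count_events(events, granularity):
    # Count events up to each of `granularity` evenly spaced times in [0, 1].
    # Assumes each row of `events` is sorted in ascending order.
    d = np.linspace(0, 1, granularity)
    events = np.asarray(events)
    if events.ndim == 1:
        return np.searchsorted(events, d)
    # 2D case: one searchsorted call per row (plain-Python loop).
    return np.vstack([np.searchsorted(row, d) for row in events])
```

This reproduces the expected outputs from the question for both the 1D and 2D examples.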
Using a vectorized version of searchsorted, searchsorted2d, we can even vectorize the whole thing and use it on all rows in one go, like so:
In [552]: searchsorted2d(events,d)
Out[552]:
array([[0, 1, 2, 2, 2, 3, 4],
[0, 2, 2, 2, 2, 2, 3]])
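Note that searchsorted2d is not a NumPy function. One possible implementation of the offset trick it relies on is sketched below, assuming all values are non-negative and every row of a is sorted in ascending order:

```python
import numpy as np

def searchsorted2d(a, b):
    # a: (m, n) array with each row sorted ascending, values >= 0.
    # b: (k,) array of query values.
    m, n = a.shape
    # Shift each row by a distinct offset so rows cannot interleave,
    # then run a single 1D searchsorted over the flattened array.
    max_num = max(a.max(), b.max()) + 1
    r = max_num * np.arange(m)[:, None]
    idx = np.searchsorted((a + r).ravel(), (b + r).ravel())
    # Undo the flattening: subtract each row's starting position.
    return idx.reshape(m, -1) - n * np.arange(m)[:, None]
```

On the 2D events array from the question, with d = np.linspace(0, 1, 7), this reproduces the per-row results above in a single call.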
Upvotes: 1
Reputation: 3094
Given that your rows are sorted, one idea for doing better than a linear scan is to run a binary search for each of your evenly spaced values. Each search retrieves the rightmost index such that all values before it are less than or equal to the searched value, which is exactly the count of events up to that time. This can be done very efficiently with Python's bisect_right function from the built-in bisect module.
bisect(a, x)
returns an insertion point which comes after (to the right of) any existing entries of x in a
Example code could go like:
import numpy as np
from bisect import bisect_right

# define your_array somehow
N = 10 # the number of sample points
lin_vals = np.linspace(0., 1., N)
counts = []
for i in range(your_array.shape[0]):
    row = your_array[i]
    # bisect_right gives the count of values in row that are <= v
    tmp = [bisect_right(row, v) for v in lin_vals]
    counts.append(tmp)
I haven't tested this code, but it should give you the general idea.
Doing so you'll have a complexity of roughly R*T*log(N), where R is the number of rows, T the number of sample points, and N the length of a row.
If this is still not fast enough, consider cropping your array rows to remove values greater than 1.
Next, since the linspaced values are increasing, you could gain speed by limiting each search to row[prev_idx:], which shrinks the range of every subsequent binary search.
You could also try re-implementing bisect_right so that it additionally returns the uppermost index it visited whose value is strictly greater than the next linspaced value. This way you can restrict row on both sides and be even faster!
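A sketch of the one-sided restriction, using bisect_right's lo parameter (the function name count_events_bisect is hypothetical; bisect_right still returns absolute indices, so no extra bookkeeping is needed):

```python
import numpy as np
from bisect import bisect_right

def count_events_bisect(events_2d, granularity):
    # Per row, count how many events occur up to each of `granularity`
    # evenly spaced times in [0, 1]; rows must be sorted ascending.
    lin_vals = np.linspace(0.0, 1.0, granularity)
    counts = np.empty((len(events_2d), granularity), dtype=int)
    for i, row in enumerate(events_2d):
        idx = 0
        for j, v in enumerate(lin_vals):
            # The grid points increase, so each search can start where
            # the previous one ended (the `lo` argument of bisect_right).
            idx = bisect_right(row, v, idx)
            counts[i, j] = idx
    return counts
```

Each binary search then runs over an ever-shrinking suffix of the row, which helps most when the events are spread evenly across the grid.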
Upvotes: 0