Getting 0s and 1s (integer bools) from a numeric numpy array in the most efficient way

Question

I have fairly large (10^6 elements) numpy arrays on which I then do some computations. One of the functions simply returns 1 if the value is larger than some value X and 0 otherwise. I understand that a simple bool check does the job:

import numpy as np
X = 50  # example threshold; any number works
x = np.arange(100)
x = np.array(x > X, dtype=int)

However, given that I'm creating a new array and making conversions, it seems very wasteful. Any ideas on how to do it in place? Something along the lines of x.round() (but one that would return either 0 or 1).

Or are my concerns completely unfounded?

Thanks! P

PS: Yes, numpy is a requirement.

Robert Kern · Accepted Answer

Quite frequently, you can get away with passing around the bool array. When used in arithmetic operations against numerical arrays, the bool array will be upcast as required, treating True as 1 and False as 0.
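
As a minimal sketch of that upcasting, the bool mask can be summed or multiplied directly without converting it first. The array contents and the names values, mask, count_above and masked_values are made up for illustration.

```python
import numpy as np

values = np.array([1.5, 2.5, 3.5, 4.5])
mask = values > 2.0            # bool array: [False, True, True, True]

# Bool elements behave as 0/1 in arithmetic, so no explicit int conversion is needed.
count_above = mask.sum()       # number of True entries
masked_values = values * mask  # mask is upcast to float64; False entries become 0.0

print(count_above)
print(masked_values)
```

If the downstream code only does arithmetic like this, the integer conversion in the question can probably be dropped entirely.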

But if you really need the most efficient way to get a true int array, use the np.greater() ufunc. Like all ufuncs, it accepts an out= keyword argument that will be used as a pre-allocated array to stuff the results into. It will convert each element on-the-fly so there is no intermediate bool array being created.

[~]
|1> import numpy as np

[~]
|2> x = np.arange(10)

[~]
|3> output = np.empty(x.shape, dtype=int)

[~]
|4> np.greater(x, 5, out=output)
array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])

[~]
|5> output
array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
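
The main saving comes from allocating the output once and reusing it across calls. A rough sketch of that pattern follows; threshold_into is a hypothetical helper name, not a numpy function, and the thresholds are arbitrary.

```python
import numpy as np

def threshold_into(values, threshold, out):
    """Hypothetical helper: write 1 where values > threshold, else 0, into out."""
    np.greater(values, threshold, out=out)
    return out

x = np.arange(10)
buffer = np.empty(x.shape, dtype=int)  # allocated once, reused below

threshold_into(x, 5, buffer)
after_first = buffer.copy()            # copy only to keep the first result for comparison

result = threshold_into(x, 2, buffer)  # overwrites buffer; no new int array is allocated
```

Note that result is the same object as buffer, so every call overwrites the previous result; copy it if an earlier result must be kept.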
