Reputation: 68
I have fairly large (10^6 elements) numpy arrays on which I then run some computations. One of the functions simply returns 1 if the value is larger than some value X and 0 otherwise. I understand that a simple bool check does the job:
x = np.arange(100)
x = np.array(x > X, dtype=int)
However, given that I'm creating a new array and making conversions, it seems very wasteful. Any ideas on how to do this in place? Something along the lines of x.round() (but one that would return either 0 or 1).
Or are my concerns completely unfounded?
Thanks! P
PS: Yes, numpy is a requirement.
Upvotes: 2
Views: 113
Reputation: 13430
Quite frequently, you can get away with passing around the bool array. When used in arithmetic operations against numerical arrays, the bool array will be upcast as required, treating True as 1 and False as 0.
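As a rough illustration of that upcasting, here is a minimal sketch. The threshold 5 and the names mask, weighted and count are made up for the example and do not come from the question.

```python
import numpy as np

x = np.arange(10)
mask = x > 5                 # bool array, no conversion to int

# In arithmetic against a numeric operand the bool array is upcast,
# with True treated as 1 and False as 0.
weighted = mask * 2.5        # float64 array: 0.0 where False, 2.5 where True
count = mask.sum()           # counts the True entries
```

If the downstream code only multiplies, adds or sums, the explicit conversion to int can probably be skipped altogether.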
But if you really need the most efficient way to get a true int array, use the np.greater() ufunc. Like all ufuncs, it accepts an out= keyword argument that will be used as a pre-allocated array to stuff the results into. It will convert each element on-the-fly so there is no intermediate bool array being created.
>>> import numpy as np
>>> x = np.arange(10)
>>> output = np.empty(x.shape, dtype=int)
>>> np.greater(x, 5, out=output)
array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
>>> output
array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
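Since the question asks specifically about doing this in place, the same out= argument can point back at the input array. This is a sketch under the assumption that x is already an integer array and that its original values are no longer needed. X = 5 is a made-up threshold.

```python
import numpy as np

X = 5                      # hypothetical threshold, not from the question
x = np.arange(10)          # integer array, as in the question

# Write the comparison result back into x's own buffer.
# x keeps its integer dtype and now holds only 0s and 1s.
np.greater(x, X, out=x)
```

This overwrites the original values, so it only helps when they are not needed afterwards. With a float array the result would be 0.0 and 1.0 rather than ints.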
Upvotes: 5