Reputation: 53
Problem: I want to compare each element of a NumPy array with a float and return an array holding the smaller of the two values at each position. For example, using the inputs:
import numpy as np
input_a = 3
input_b = np.array([1,2,3,4,5])
the output should be
output = np.array([1,2,3,3,3])
My current solution creates a new np.array filled with the constant and then uses np.minimum():
c = np.copy(input_b)
c.fill(input_a)
output = np.minimum(input_b, c)
However, I am afraid that this is not the most efficient solution. Is there a more elegant / efficient way to achieve this?
Upvotes: 0
Views: 1333
Reputation: 231540
There's a builtin to do this: clip
output = input_b.clip(max=input_a)
or, if you want to modify input_b itself:
np.clip(input_b, None, input_a, out=input_b)
Here it's doing the same as minimum, but it can also do maximum in the same call. Some versions accept the max keyword, others don't.
clip has a modest edge over minimum in my timings, but I'd recommend whichever one seems clearest in intent.
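As a minimal sketch of the point above, the following compares clip with an upper bound only against clip with both bounds in one call. The lower bound of 2 is an illustrative value that does not appear in the question.

```python
import numpy as np

input_a = 3
input_b = np.array([1, 2, 3, 4, 5])

# Upper bound only: this gives the same result as np.minimum(input_b, input_a).
upper_only = np.clip(input_b, None, input_a)

# Lower and upper bound in a single call.
# The lower bound 2 is a made-up value, used only for illustration.
both_bounds = np.clip(input_b, 2, input_a)

print(upper_only.tolist())   # [1, 2, 3, 3, 3]
print(both_bounds.tolist())  # [2, 2, 3, 3, 3]
```

Passing the bounds positionally, as here, avoids relying on the keyword names, which the answer notes are not accepted by every version.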
Upvotes: 1
Reputation: 176938
I think np.minimum is fine for this operation:
>>> np.minimum(input_b, 3)
array([1, 2, 3, 3, 3])
If you want to modify input_b directly, use the out keyword argument to fill input_b with the pair-wise minimum values.
>>> np.minimum(input_b, 3, out=input_b)
>>> input_b
array([1, 2, 3, 3, 3])
This is quicker than using boolean indexing and then assigning values:
>>> %timeit input_b[input_b > input_a] = input_a
100000 loops, best of 3: 4.16 µs per loop
>>> %timeit np.minimum(input_b, 3, out=input_b)
100000 loops, best of 3: 2.53 µs per loop
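Since the question asks about comparing against a float, here is a small sketch of how np.minimum treats a scalar operand: the scalar is broadcast against the array, so no constant-filled array is needed. The names values, threshold and mixed, and the value 2.5, are illustrative assumptions rather than part of the original question.

```python
import numpy as np

# Hypothetical float data and a hypothetical float limit.
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
threshold = 2.5

# The scalar is broadcast against every element of the array.
capped = np.minimum(values, threshold)
print(capped.tolist())  # [1.0, 2.0, 2.5, 2.5, 2.5]

# With an integer array and a non-integer limit, the result is a float array.
mixed = np.minimum(np.array([1, 2, 3, 4, 5]), threshold)
print(mixed.dtype)      # float64
print(mixed.tolist())   # [1.0, 2.0, 2.5, 2.5, 2.5]

# In-place variant on the float array, using the out keyword.
np.minimum(values, threshold, out=values)
print(values.tolist())  # [1.0, 2.0, 2.5, 2.5, 2.5]
```

Because the mixed case produces floats, the in-place form is shown here only on an array that already has a float dtype.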
Upvotes: 2
Reputation: 53698
Your best bet is to use logical indexing.
import numpy as np
input_a = 3
input_b = np.array([1,2,3,4,5])
input_b[input_b > input_a] = input_a
print(input_b)
# [1 2 3 3 3]
input_b > input_a will return a mask array of True or False values, where in this case an element is True if the corresponding element in input_b is greater than input_a. You can then use this mask to index input_b and modify only those values.
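To make the mask visible, here is a minimal sketch that stores it in a variable before using it. Note that the assignment modifies the indexed array in place, so this version works on a copy to keep the original intact; the names mask and capped are introduced here for illustration.

```python
import numpy as np

input_a = 3
input_b = np.array([1, 2, 3, 4, 5])

# Boolean mask: True where the element exceeds the limit.
mask = input_b > input_a
print(mask.tolist())     # [False, False, False, True, True]

# Work on a copy so that input_b itself is left unchanged.
capped = input_b.copy()
capped[mask] = input_a   # overwrite only the masked positions

print(capped.tolist())   # [1, 2, 3, 3, 3]
print(input_b.tolist())  # [1, 2, 3, 4, 5]
```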
Note that using logical indexing is quicker than using numpy.where for this particular array, though I can't tell you why exactly.
import timeit
setup = 'from __main__ import np, input_a, input_b'
print(timeit.timeit('input_b[input_b > input_a] = input_a', setup=setup))
# 2.2448947575996456
print(timeit.timeit('np.where(input_b < input_a, input_b, input_a)', setup=setup))
# 5.35540746395358
Upvotes: 3
Reputation: 107347
You can use numpy.where:
>>> np.where(input_b < input_a, input_b, input_a)
array([1, 2, 3, 3, 3])
Upvotes: 0
Reputation: 22902
A one-liner for this would be to use numpy.where:
>>> np.where(input_b < input_a, input_b, input_a)
array([ 1., 2., 3., 3., 3.])
Here we pass numpy.where three arguments, where the first is a boolean array marking where input_b < input_a. Wherever the value in this first argument is True, we take the value at the corresponding index from the second argument (input_b). Otherwise we take the value of input_a.
Edit: In fact, as @Kasra's answer shows, you can pass input_a directly without converting it to an np.array.
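The three-argument form can be sketched step by step as follows. The variable names condition, via_where and via_minimum are introduced here for illustration, and the final comparison only checks that, for these inputs, the result matches np.minimum.

```python
import numpy as np

input_a = 3
input_b = np.array([1, 2, 3, 4, 5])

# First argument: boolean array marking where input_b is below the limit.
condition = input_b < input_a
print(condition.tolist())  # [True, True, False, False, False]

# Take input_b where the condition is True, otherwise the scalar input_a.
via_where = np.where(condition, input_b, input_a)
print(via_where.tolist())  # [1, 2, 3, 3, 3]

# For these inputs the result is identical to np.minimum.
via_minimum = np.minimum(input_b, input_a)
print(np.array_equal(via_where, via_minimum))  # True
```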
Upvotes: 0