Reputation: 14843
Say I have a very large array of values between 1 and 180, and these are in an array of uint8s (which can go as high as 255). I would like to add 90 (modulo 180) to each value:
original = np.array([1, 2, 3, 170, 171, 172], dtype=np.uint8)
modified = (original + 90) % 180
Unfortunately, this yields an incorrect result, because the larger numbers overflow their uint8 integers in the first step when adding 90: 170 + 90 = 260 which is greater than 255.
# (170 + 90) % 180 is 80, not 4 :(
array([91, 92, 93, 4, 5, 6], dtype=uint8)
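To make the wraparound explicit, here is a minimal sketch (the variable names `wrapped_sum` and `reference` are only illustrative) comparing the uint8 sum with the same arithmetic done in a wider integer type:

```python
import numpy as np

original = np.array([1, 2, 3, 170, 171, 172], dtype=np.uint8)

# uint8 addition wraps modulo 256, so 170 + 90 = 260 is stored as 260 - 256 = 4.
wrapped_sum = original + np.uint8(90)
print(wrapped_sum.tolist())  # [91, 92, 93, 4, 5, 6]

# The intended result, computed in a wider type purely for comparison.
reference = (original.astype(np.int64) + 90) % 180
print(reference.tolist())  # [91, 92, 93, 80, 81, 82]
```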
I'm operating in a very performance-sensitive context, and my input list is very large. As such, I would like to avoid the penalty of converting this array to a larger datatype, and I would like to use efficient operations (e.g. avoiding looping through the array and processing each value individually).
How can I accomplish this?
Upvotes: 2
Views: 1190
Reputation: 59731
One very simple option, since you are dealing with uint8, is to simply compute in advance the result for each possible value in the array and use it:
import numpy as np
original = np.array([1, 2, 3, 170, 171, 172], dtype=np.uint8)
value_map = ((np.arange(256) + 90) % 180).astype(np.uint8)
modified = value_map[original]
print(modified)
# [91 92 93 80 81 82]
The good thing about this is that it needs no intermediate array of a wider dtype: apart from the result itself, the only additional memory is the 256-element value_map, and for any larger array you will be saving most of the computation too.
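If the result buffer can be reused across calls, the lookup can also write into a preallocated array. This is a sketch, not part of the original answer: it assumes `np.take` with its `out` and `mode` arguments is acceptable here, and `out_buf` is a hypothetical name.

```python
import numpy as np

original = np.array([1, 2, 3, 170, 171, 172], dtype=np.uint8)
value_map = ((np.arange(256) + 90) % 180).astype(np.uint8)

# Allocate the output once; it can be reused for later arrays of the same shape.
out_buf = np.empty_like(original)

# uint8 indices are always within 0..255, so no index is ever clipped here.
# A mode other than the default "raise" lets NumPy write into out_buf directly.
np.take(value_map, original, out=out_buf, mode="clip")
print(out_buf.tolist())  # [91, 92, 93, 80, 81, 82]
```

Whether this is measurably faster than `value_map[original]` depends on the array size and NumPy version, so it is worth timing on real data.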
Running a time benchmark against casting:
import numpy as np
def add_val_mod_cast(a, val, mod):
    return ((a.astype(np.uint16) + val) % mod).astype(np.uint8)
def add_val_mod_map(a, val, mod):
    value_map = ((np.arange(256) + val) % mod).astype(np.uint8)
    return value_map[a]
np.random.seed(0)
a = np.random.randint(256, size=10_000_000).astype(np.uint8)
val = 90
mod = 180
print((add_val_mod_cast(a, val, mod) == add_val_mod_map(a, val, mod)).all())
# True
%timeit add_val_mod_cast(a, val, mod)
# 72.6 ms ± 2.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit add_val_mod_map(a, val, mod)
# 40.8 ms ± 606 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
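The `%timeit` lines above only work in IPython or Jupyter. In a plain Python script, roughly the same comparison can be sketched with the standard `timeit` module. The array size, repeat counts and use of `np.random.default_rng` are choices made for this sketch, and the timings will differ by machine.

```python
import timeit

import numpy as np


def add_val_mod_cast(a, val, mod):
    return ((a.astype(np.uint16) + val) % mod).astype(np.uint8)


def add_val_mod_map(a, val, mod):
    value_map = ((np.arange(256) + val) % mod).astype(np.uint8)
    return value_map[a]


rng = np.random.default_rng(0)
# Smaller than the 10_000_000 elements used above, to keep the script quick.
a = rng.integers(0, 256, size=1_000_000, dtype=np.uint8)

for fn in (add_val_mod_cast, add_val_mod_map):
    # Best of 3 repeats, 5 calls per repeat, reported per call.
    best = min(timeit.repeat(lambda: fn(a, 90, 180), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```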
Upvotes: 3
Reputation: 221704
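Here's one with some math and without upcasting:
def add_with_modulus(original, addval, modulusval=180):
    # uint8 addition wraps modulo 256 on overflow
    v = original + addval
    # Where original + addval >= modulusval, subtract modulusval by adding 256 - modulusval (mod 256)
    v[modulusval - original <= addval] += 256 - modulusval
    return v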
Usage: add_with_modulus(original, addval=90, modulusval=180).
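This works because, in uint8 arithmetic, adding 256 - modulusval is the same as subtracting modulusval modulo 256, and the mask selects exactly the elements where original + addval reaches modulusval. A small exhaustive check over the question's input range is sketched below. It assumes inputs never exceed the modulus, since `modulusval - original` would itself wrap for larger values.

```python
import numpy as np


def add_with_modulus(original, addval, modulusval=180):
    v = original + addval  # wraps modulo 256 on overflow
    v[modulusval - original <= addval] += 256 - modulusval
    return v


# Every input the question allows (0 is included for completeness).
all_inputs = np.arange(0, 181, dtype=np.uint8)

# Reference computed in a wider type, used only for verification.
reference = (all_inputs.astype(np.int64) + 90) % 180

result = add_with_modulus(all_inputs, addval=90, modulusval=180)
print(np.array_equal(result, reference))  # True
```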
Upvotes: 1
Reputation: 287
import numpy as np
Create a random 10×10 NumPy array of integers between 0 and 254 (the upper bound passed to randint is exclusive). Note that this array uses NumPy's default integer type rather than uint8:
np_array = np.random.randint(255, size=(10, 10))
Add 90 to each element and take the result modulo 180:
np_array = (np_array + 90) % 180
Upvotes: -1
Reputation: 2479
You can simply cast the array to np.uint16:
>>> ((original.astype(np.uint16) + 90) % 180).astype(np.uint8)
array([91, 92, 93, 80, 81, 82], dtype=uint8)
If you want to achieve this using only uint8, you can apply a second increment to just those elements that are bigger than 255-90, i.e. the ones whose sum overflowed:
>>> modified = (original + 90) % 180
>>> modified[original > 255-90] += 256-180
>>> modified
array([91, 92, 93, 80, 81, 82], dtype=uint8)
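A quick exhaustive check of this uint8-only approach over inputs 0 to 180 is sketched below. The comparison is written strictly as `>`, matching "bigger than 255-90": an input of exactly 165 gives 255, which does not overflow and so must not receive the correction. The function name `add_90_mod_180_uint8` is only illustrative.

```python
import numpy as np


def add_90_mod_180_uint8(original):
    # First step wraps modulo 256 for inputs above 165.
    modified = (original + 90) % 180
    # Correct only the elements whose sum overflowed (original + 90 > 255).
    modified[original > 255 - 90] += 256 - 180
    return modified


all_inputs = np.arange(0, 181, dtype=np.uint8)

# Reference computed in a wider type, used only for verification.
reference = (all_inputs.astype(np.int64) + 90) % 180

result = add_90_mod_180_uint8(all_inputs)
print(np.array_equal(result, reference))  # True
print(int(result[165]))  # 75, since (165 + 90) % 180 == 75
```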
Upvotes: 1