Reputation: 1981
I want the resulting array as a binary yes/no.
I came up with
import numpy
import PIL.Image

img = PIL.Image.open(filename)
array = numpy.array(img)
thresholded_array = numpy.copy(array)
brightest = numpy.amax(array)
threshold = brightest/2
for b in xrange(490):
    for c in xrange(490):
        if array[b][c] > threshold:
            thresholded_array[b][c] = 255
        else:
            thresholded_array[b][c] = 0
out=PIL.Image.fromarray(thresholded_array)
but iterating over the array one value at a time is very slow. I know there must be a faster way; what is the fastest?
Upvotes: 8
Views: 16529
Reputation: 352979
Instead of looping, you can compare the entire array at once in several ways. Starting from
>>> arr = np.random.randint(0, 255, (3,3))
>>> brightest = arr.max()
>>> threshold = brightest // 2
>>> arr
array([[214, 151, 216],
[206, 10, 162],
[176, 99, 229]])
>>> brightest
229
>>> threshold
114
Method #1: use np.where:
>>> np.where(arr > threshold, 255, 0)
array([[255, 255, 255],
[255, 0, 255],
[255, 0, 255]])
Method #2: use boolean indexing to create a new array
>>> up = arr > threshold
>>> new_arr = np.zeros_like(arr)
>>> new_arr[up] = 255
Method #3: do the same, but use an arithmetic hack
>>> (arr > threshold) * 255
array([[255, 255, 255],
[255, 0, 255],
[255, 0, 255]])
which works because False == 0 and True == 1.
For a 1000x1000 array, the arithmetic hack is fastest for me, but to be honest I'd use np.where because I think it's clearest:
>>> %timeit np.where(arr > threshold, 255, 0)
100 loops, best of 3: 12.3 ms per loop
>>> %timeit up = arr > threshold; new_arr = np.zeros_like(arr); new_arr[up] = 255;
100 loops, best of 3: 14.2 ms per loop
>>> %timeit (arr > threshold) * 255
100 loops, best of 3: 6.05 ms per loop
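To tie this back to the image case in the question, here is a minimal sketch that wraps the np.where approach in a function. The name threshold_half_max is made up for illustration, and the small hard-coded array stands in for the pixel data that numpy.array(img) would give. The cast to uint8 is my assumption about what you want: np.where with plain integer arguments typically returns a wider integer type, and an 8-bit array is the usual choice for a greyscale image.

```python
import numpy as np

def threshold_half_max(array):
    """Return a uint8 array: 255 where array > (max / 2), else 0.

    Hypothetical helper for illustration, not part of NumPy or PIL.
    """
    threshold = array.max() / 2
    # Compare the whole array at once, then cast to 8-bit for image use.
    return np.where(array > threshold, 255, 0).astype(np.uint8)

# Small stand-in for the pixel data of a greyscale image.
pixels = np.array([[214, 151, 216],
                   [206,  10, 162],
                   [176,  99, 229]], dtype=np.uint8)

result = threshold_half_max(pixels)
print(result)
```

The result should then be usable with PIL.Image.fromarray in the same way as thresholded_array in the question.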
Upvotes: 17
Reputation: 238081
I'm not sure whether your thresholding operation is special, e.g. whether it needs to be customized for every pixel, but you can just use a logical operation on the NumPy array. For example:
import numpy as np
a = np.round(np.random.rand(5,5)*255)
thresholded_array = a > 100  # <-- thresholding at a value of 100
print(a)
print(thresholded_array)
Gives:
[[ 238. 201. 165. 111. 127.]
[ 188. 55. 157. 121. 129.]
[ 220. 127. 231. 75. 23.]
[ 76. 67. 75. 141. 96.]
[ 228. 94. 172. 26. 195.]]
[[ True True True True True]
[ True False True True True]
[ True True True False False]
[False False False True False]
[ True False True False True]]
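If you need 0/255 values rather than True/False, or a different threshold for every pixel, the same comparison still works. The following is only a sketch: the array values and the per_pixel thresholds are made up for illustration, and the per-pixel case assumes the threshold array has the same shape as the data.

```python
import numpy as np

a = np.array([[238., 201., 165.],
              [188.,  55., 157.],
              [ 76.,  67., 141.]])

# One threshold for the whole array: a boolean yes/no mask.
mask = a > 100

# Convert the mask to 0/255 values in an 8-bit array.
as_image = mask.astype(np.uint8) * 255

# Hypothetical per-pixel thresholds (same shape as a); the comparison
# is done element by element.
per_pixel = np.array([[100., 210., 100.],
                      [200.,  50., 100.],
                      [ 50., 100., 150.]])
mask_per_pixel = a > per_pixel

print(mask)
print(as_image)
print(mask_per_pixel)
```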
Upvotes: 3