Reputation: 1789
I have a numpy array X
of size (n, m)
and of type np.uint8
(so it only contains values in [0, 255]
). I also have a mapping f
from [0, 255]
to [0, 3]
.
I want to create an array Y
of shape (4, n, m)
such that y_{k, i, j} = 1 if k == f(x_{i, j})
and 0 otherwise. For now, I do it like this:
Y = np.zeros((4, n, m))
for i in range(256):
Y[f(i), X == i] = 1
But this is super slow, and I'm not able to find a more efficient way to do it. Any ideas?
Upvotes: 1
Views: 912
Reputation: 231385
The mix of scalar indexing and boolean appears to be hurting your speed:
In [706]: %%timeit
...: Y=np.zeros((4,3,4))
...: for i in range(256):
...: Y[f(i), X==i]+=1
...:
100 loops, best of 3: 12.5 ms per loop
In [722]: %%timeit
...: Y=np.zeros((4,3,4))
...: for i in range(256):
...: I,J=np.where(X==i)
...: Y[f(i),I,J] = 1
...:
100 loops, best of 3: 8.55 ms per loop
This is for
X=np.arange(12,dtype=np.uint8).reshape(3,4)
def f(i):
return i%4
In this case, the f(i)
is not a major time consumer:
In [718]: timeit K=[f(i) for i in range(256)]
10000 loops, best of 3: 120 µs per loop
but getting the X==i
indexes is slow
In [720]: timeit K=[X==i for i in range(256)]
1000 loops, best of 3: 1.29 ms per loop
In [721]: timeit K=[np.where(X==i) for i in range(256)]
100 loops, best of 3: 2.73 ms per loop
We need to rethink the X==i
part of the mapping, rather than the f(i)
part.
=====================
Flattening the last 2 dimensions helps;
In [780]: %%timeit
...: X1=X.ravel()
...: Y=np.zeros((4,12))
...: for i in range(256):
...: Y[f(i),X1==i]=1
...: Y.shape=(4,3,4)
...:
100 loops, best of 3: 3.16 ms per loop
Upvotes: 1
Reputation: 221534
Assuming f
could operate on all iterating values in one-go, you can use broadcasting
-
Yout = (f(X) == np.arange(4)[:,None,None]).astype(int)
Runtime test and verification -
In [35]: def original_app(X,n,m):
...: Y = np.zeros((4, n, m))
...: for i in range(256):
...: Y[f(i), X == i] = 1
...: return Y
...:
In [36]: # Setup Inputs
...: n,m = 2000,2000
...: X = np.random.randint(0,255,(n,m)).astype('uint8')
...: v = np.random.randint(4, size=(256,))
...: def f(x):
...: return v[x]
...:
In [37]: Y = original_app(X,n,m)
...: Yout = (f(X) == np.arange(4)[:,None,None]).astype(int)
...:
In [38]: np.allclose(Yout,Y) # Verify
Out[38]: True
In [39]: %timeit original_app(X,n,m)
1 loops, best of 3: 3.77 s per loop
In [40]: %timeit (f(X) == np.arange(4)[:,None,None]).astype(int)
10 loops, best of 3: 74.5 ms per loop
Upvotes: 1