Reputation: 81
In C++ there is std::bitset: I can convert a boolean array to an integer with to_ulong, and I can also use the resulting unsigned long as a char buffer.
In Python, what can convert a boolean array to a char array?
Specifically, I have a boolean NumPy array of shape [n, b],
and I want to get an array of shape [n, b/8].
Right now I build the char array by combining the booleans by hand, but this seems slow.
Is there a good way to improve the speed?
import numpy as np
def joiner(X):
    # Combine 8 boolean columns into one integer, column i being bit i.
    return sum(X[:, i] * 2**i for i in range(8))
arr = np.random.randint(0, 2, [100000, 8 * 1024])
cov = np.split(arr, (arr.shape[-1] + 7) // 8, axis = -1)
cov = np.stack(list(map(joiner, cov)), axis = -1)
CPU times: user 17.5 s, sys: 7.52 s, total: 25 s
Wall time: 25.1 s
Upvotes: 1
Views: 233
Reputation: 81
Generate the data:
%%time
import numpy as np
arr = np.random.randint(0, 2, [100000, 8 * 1024], dtype=np.bool_)
CPU times: user 1.32 s, sys: 1.24 s, total: 2.55 s Wall time: 2.14 s
Fastest conversion method:
%%time
cov = sum(arr[:, i::8] * np.uint8(2**i) for i in range(8))  # slice columns so the result has shape [n, b/8]
CPU times: user 190 ms, sys: 94.5 ms, total: 285 ms Wall time: 284 ms
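As a small check of the strided-sum idea, here is a sketch that compares it with the split-and-join code from the question on a tiny array. It assumes the slice runs along the last axis (columns), so that column 8k+i becomes bit i of output byte k. The helper names pack_split and pack_strided are made up for this illustration.

```python
import numpy as np

def pack_split(arr):
    # The question's approach: split into groups of 8 columns,
    # then weight column i of each group by 2**i.
    chunks = np.split(arr, arr.shape[-1] // 8, axis=-1)
    return np.stack([sum(c[:, i] * 2**i for i in range(8)) for c in chunks], axis=-1)

def pack_strided(arr):
    # arr[:, i::8] picks bit i of every output byte in one slice.
    return sum(arr[:, i::8] * np.uint8(2**i) for i in range(8))

rng = np.random.default_rng(0)
small = rng.integers(0, 2, size=(5, 16)).astype(np.bool_)

a = pack_split(small)
b = pack_strided(small)
print(b.shape, b.dtype, np.array_equal(a, b))
```

Multiplying by np.uint8 keeps every intermediate array at one byte per element, which is probably a large part of why this variant is fast.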
@D.Manasreh's method:
%%time
mask = np.array([2**i for i in range(8)])
cov = np.sum(mask * arr.reshape((-1, 8)), axis=1).reshape(arr.shape[0], -1)
CPU times: user 4.54 s, sys: 9.67 s, total: 14.2 s Wall time: 14.7 s
Other experiments:
%%time
cov = arr.reshape(-1, 8) @ (2 ** np.arange(8)).astype(np.uint8)
CPU times: user 717 ms, sys: 96 ms, total: 813 ms Wall time: 811 ms
%%time
cov = arr.reshape(-1, 32) @ (2 ** np.arange(32)).astype(np.uint32)
CPU times: user 918 ms, sys: 497 ms, total: 1.42 s Wall time: 1.43 s
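NumPy's built-in np.packbits may also be worth timing; it was not part of the measurements above, so no speed claim is made for it. The sketch below only shows that, with bitorder="little" (available from NumPy 1.17), it produces the same bytes as the matrix-product form on a small array. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(4, 16)).astype(np.bool_)

# bitorder="little" makes column 8k+i bit i of output byte k,
# which matches the 2**i weights used in the other methods.
packed = np.packbits(bits, axis=-1, bitorder="little")

# Reference result using the matrix-product form, reshaped back to [n, b/8].
weights = (2 ** np.arange(8)).astype(np.uint8)
reference = (bits.reshape(-1, 8).astype(np.uint8) @ weights).reshape(bits.shape[0], -1)

print(packed.shape, packed.dtype, np.array_equal(packed, reference))
```

The default bitorder is "big", which would put the first column of each group in the most significant bit instead.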
Upvotes: 0
Reputation: 940
You can do this:
mask = np.array([2**i for i in range(8)])
cov = np.sum(mask * arr.reshape((-1, 8)), axis=1).reshape(arr.shape[0], -1)
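To make the bit order concrete, here is the same expression applied to a hand-written 2 x 16 array (the input values are made up for illustration). The result comes back in NumPy's default integer dtype, so a final cast to np.uint8 is assumed here if a byte array is wanted.

```python
import numpy as np

arr = np.array(
    [[1, 0, 0, 0, 0, 0, 0, 0,   0, 1, 0, 0, 0, 0, 0, 0],
     [1, 1, 1, 1, 1, 1, 1, 1,   0, 0, 0, 0, 0, 0, 0, 1]],
    dtype=np.bool_,
)

mask = np.array([2**i for i in range(8)])  # weights 1, 2, 4, ..., 128
cov = np.sum(mask * arr.reshape((-1, 8)), axis=1).reshape(arr.shape[0], -1)
print(cov.tolist())  # [[1, 2], [255, 128]]

# Cast to one byte per value; safe because each sum is at most 255.
cov_bytes = cov.astype(np.uint8)
```

Note that mask * arr.reshape((-1, 8)) builds a temporary with one default-width integer per input bit, which on large inputs is likely to cost noticeably more memory than the input itself.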
Upvotes: 1