tsp

Reputation: 81

Convert a boolean NumPy array to a char (byte) NumPy array

In C++ there is std::bitset: I can convert a boolean array to an integer with to_ulong() and then use that unsigned long as a char buffer.
What is the equivalent in Python for converting a boolean array to a char array?

Specifically, I have a boolean NumPy array of shape [n, b],
and I want to pack it into an array of shape [n, b/8], with 8 booleans per byte.
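
To make the target layout concrete, here is a minimal sketch on a tiny made-up input. The array `small` and the helper `pack_reference` are hypothetical names used only for illustration, and the bit order (column 8*j + i goes to bit i of byte j) is an assumption taken from the 2**i weights used in the code further down.

```python
import numpy as np

# Hypothetical 2 x 16 boolean input (n = 2, b = 16).
small = np.array(
    [[1, 0, 0, 0, 0, 0, 0, 0,   0, 1, 0, 0, 0, 0, 0, 0],
     [1, 1, 1, 1, 1, 1, 1, 1,   0, 0, 0, 0, 0, 0, 0, 1]],
    dtype=bool,
)

def pack_reference(x):
    """Slow but explicit: byte j of a row is the OR of x[row, 8*j + i] << i."""
    n, b = x.shape
    out = np.zeros((n, b // 8), dtype=np.uint8)
    for j in range(b // 8):
        for i in range(8):
            out[:, j] |= x[:, 8 * j + i].astype(np.uint8) << np.uint8(i)
    return out

packed = pack_reference(small)
print(packed.shape)  # (2, 2)
print(packed)        # [[1 2], [255 128]] laid out as a 2 x 2 array
```

This loop is only meant as a readable reference for checking faster versions, not as something to run on the full array.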

Right now I build the char array by combining the boolean columns, but this is slow.

Is there a good way to speed it up?

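import numpy as np
def joiner(X):
    # X holds 8 columns; column i becomes bit i of the output value
    return sum(X[:, i] * 2**i for i in range(8))

arr = np.random.randint(0, 2, [100000, 8 * 1024])
# split into groups of 8 columns, pack each group, then restack
cov = np.split(arr, (arr.shape[-1] + 7) // 8, axis = -1)
cov = np.stack(list(map(joiner, cov)), axis = -1)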

CPU times: user 17.5 s, sys: 7.52 s, total: 25 s
Wall time: 25.1 s

Upvotes: 1

Views: 233

Answers (2)

tsp

Reputation: 81

Generate the data:

%%time
import numpy as np
arr = np.random.randint(0, 2, [100000, 8 * 1024], dtype=np.bool_)

CPU times: user 1.32 s, sys: 1.24 s, total: 2.55 s Wall time: 2.14 s


Fastest conversion method:

%%time
# slice columns (axis 1): column 8*j + i becomes bit i of byte j
cov = sum(arr[:, i::8] * np.uint8(2**i) for i in range(8))

CPU times: user 190 ms, sys: 94.5 ms, total: 285 ms Wall time: 284 ms


D.Manasreh's method:

%%time
mask = np.array([2**i for i in range(8)])
cov = np.sum(mask * arr.reshape((-1, 8)), axis=1).reshape(arr.shape[0], -1)

CPU times: user 4.54 s, sys: 9.67 s, total: 14.2 s Wall time: 14.7 s


Other experiments (these return a flat array that still needs a reshape to [n, -1]):

%%time
cov = arr.reshape(-1, 8) @ (2 ** np.arange(8)).astype(np.uint8)

CPU times: user 717 ms, sys: 96 ms, total: 813 ms Wall time: 811 ms

%%time
cov = arr.reshape(-1, 32) @ (2 ** np.arange(32)).astype(np.uint32)

CPU times: user 918 ms, sys: 497 ms, total: 1.42 s Wall time: 1.43 s
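
For comparison, NumPy also provides np.packbits, which packs along an axis directly. It was not timed above, so the sketch below only checks that it gives the same bytes as the slicing method on a small hypothetical array (`demo` is an illustrative name). It assumes a NumPy version that supports the `bitorder` argument (1.17 or later).

```python
import numpy as np

rng = np.random.default_rng(0)
# Small hypothetical stand-in for the real [100000, 8 * 1024] array.
demo = rng.integers(0, 2, size=(4, 32)).astype(bool)

# bitorder="little" puts column 8*j + i on bit i of byte j,
# matching the 2**i weights used in the methods above.
packed = np.packbits(demo, axis=-1, bitorder="little")

# Same result via the column-slicing method, used here as a cross-check.
reference = sum(demo[:, i::8] * np.uint8(2**i) for i in range(8))

print(packed.shape, packed.dtype)
print(np.array_equal(packed, reference))
```

The default `bitorder="big"` would instead put the first column of each group on the most significant bit, so the argument matters if the bytes must match the other methods.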

Upvotes: 0

D.Manasreh

Reputation: 940

You can do this:

mask = np.array([2**i for i in range(8)])
cov = np.sum(mask * arr.reshape((-1, 8)), axis=1).reshape(arr.shape[0], -1)
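
A small sketch of what these two lines do, on a hypothetical 2 x 16 array (`demo` and `cov_bytes` are illustrative names). Note that `mask` is built with NumPy's default integer type, so the result is a wide integer array rather than bytes; the final cast to uint8 is an added step, assuming a byte array is what is wanted.

```python
import numpy as np

demo = np.zeros((2, 16), dtype=bool)
demo[0, 0] = True    # row 0, byte 0, bit 0 -> value 1
demo[1, 15] = True   # row 1, byte 1, bit 7 -> value 128

mask = np.array([2**i for i in range(8)])
# Each group of 8 booleans is weighted by 1, 2, 4, ..., 128 and summed.
cov = np.sum(mask * demo.reshape((-1, 8)), axis=1).reshape(demo.shape[0], -1)

# Values never exceed 255, so the cast to uint8 is lossless.
cov_bytes = cov.astype(np.uint8)
print(cov.dtype, cov_bytes.dtype)
print(cov_bytes)
```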

Upvotes: 1
