Reputation: 951
Numpy has a library function, np.unpackbits, which will unpack a uint8 into a bit vector of length 8. Is there a correspondingly fast way to unpack larger numeric types, e.g. uint16 or uint32? I am working on a problem that involves frequent translation between numbers (for array indexing) and their bit vector representations, and the bottleneck is our pack and unpack functions.
Upvotes: 26
Views: 14268
Reputation: 637
This is the solution I use:
import numpy as np

def unpackbits(x, num_bits):
    if np.issubdtype(x.dtype, np.floating):
        raise ValueError("numpy data type needs to be int-like")
    xshape = list(x.shape)
    x = x.reshape([-1, 1])
    # one mask per bit position, so x & mask isolates each bit (LSB first)
    mask = 2**np.arange(num_bits, dtype=x.dtype).reshape([1, num_bits])
    return (x & mask).astype(bool).astype(int).reshape(xshape + [num_bits])
This is a completely vectorized solution that works with an ndarray of any dimension and can unpack however many bits you want.
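For example (my own usage sketch, not part of the original answer), on a 2-D uint16 array it returns one LSB-first bit vector per element:

a = np.array([[1, 2], [3, 4]], dtype=np.uint16)
unpackbits(a, 16).shape    # (2, 2, 16)
unpackbits(a, 16)[0, 0]    # the 16 bits of 1, LSB first: [1, 0, 0, ..., 0]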
Upvotes: 21
Reputation: 1
This works for arbitrary arrays of unsigned integers (i.e. also for multidimensional arrays and for numbers larger than the uint8 max value).
It cycles over the number of bits rather than over the number of array elements, so it is reasonably fast.
import numpy

def my_ManyParallel_uint2bits(in_intAr, Nbits):
    '''convert (numpy array of uint => array of Nbits bits) for many bits in parallel'''
    inSize_T = in_intAr.shape
    in_intAr_flat = in_intAr.flatten()
    out_NbitAr = numpy.zeros((len(in_intAr_flat), Nbits))
    for iBits in range(Nbits):
        # shift bit iBits down to position 0 and mask it out
        out_NbitAr[:, iBits] = (in_intAr_flat >> iBits) & 1
    out_NbitAr = out_NbitAr.reshape(inSize_T + (Nbits,))
    return out_NbitAr
A=numpy.arange(256,261).astype('uint16')
# array([256, 257, 258, 259, 260], dtype=uint16)
B=my_ManyParallel_uint2bits(A,16).astype('uint16')
# array([[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
# [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
# [0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]], dtype=uint16)
Upvotes: 0
Reputation: 25672
You can do this with view and unpackbits.
Input:
np.unpackbits(np.arange(2, dtype=np.uint16).view(np.uint8))
Output:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
For a = np.arange(int(1e6), dtype=np.uint16) this is pretty fast at around 7 ms on my machine:
%%timeit
np.unpackbits(a.view(np.uint8))
100 loops, best of 3: 7.03 ms per loop
As for endianness, you'll have to look at http://docs.scipy.org/doc/numpy/user/basics.byteswapping.html and apply the suggestions there depending on your needs.
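For example (my own sketch, not from the answer), if you want each element's bits MSB-first regardless of the machine's native byte order, one option is to cast to a big-endian dtype before viewing the array as bytes:

bits = np.unpackbits(a.astype('>u2').view(np.uint8)).reshape(-1, 16)
bits[1]   # array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1], dtype=uint8)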
Upvotes: 26
Reputation: 4199
I have not found a function for this either, but maybe Python's built-in struct.unpack can help make a custom function faster than shifting and masking longer uints (note that I am using uint64).
>>> import struct
>>> import numpy as np
>>> N = np.uint64(2 + 2**10 + 2**18 + 2**26)
>>> struct.unpack('>BBBBBBBB', N.tobytes())
(2, 4, 4, 4, 0, 0, 0, 0)
The idea is to convert those to uint8, use unpackbits, and concatenate the result. Or, depending on your application, it may be more convenient to use structured arrays.
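A rough sketch of that idea (my own illustration, not code from the answer):

>>> as_bytes = np.frombuffer(struct.pack('>Q', int(N)), dtype=np.uint8)
>>> bits = np.unpackbits(as_bytes)      # the 64 bits of N, MSB first
>>> np.flatnonzero(bits[::-1])          # which bits are set, counted from the LSB
array([ 1, 10, 18, 26])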
There is also the built-in bin() function, which produces a string of 0s and 1s, but I am not sure how fast it is, and it requires postprocessing too.
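For completeness, one possible form of that postprocessing (my own sketch, assuming a fixed width of 64 bits):

bits = np.array([int(c) for c in bin(int(N))[2:].zfill(64)], dtype=np.uint8)  # MSB-first bits of N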
Upvotes: 3