Reputation: 187

How to convert an integer array to a specific length binary array

I am trying to convert a NumPy integer array, say A=[3,5,2], into a NumPy binary array in least-significant-bit-first order, using a fixed number of bits per element. For a length of 6, the result should be:

A' = [1 1 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0]

The first 6 values encode the first element of A, the next 6 encode the second element, and the last 6 encode the last element.

My current solution is as follows:

np.multiply( np.delete( np.unpackbits( np.abs(A.astype(int)).view("uint8")).reshape(-1,8)[:,::-1].reshape(-1,64), np.s_[ln::],1).astype("float64").ravel(), np.repeat(np.sign(A), ln))

where ln is the desired number of bits per element (6 in the example above).

Is there any faster way to do this?

Thanks in advance.

EDIT: I should have pointed this out earlier: A can also contain negative values. For instance, if A=[-11,5] and ln=6, then the returned array should be:

A'=[-1 -1 0 -1 0 0 1 0 1 0 0 0]

Note that ln=6 is just an example. It could be even 60.

Sorry for missing this part of the requirement.
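
To pin down the expected output, here is a deliberately slow, loop-based sketch of the mapping described above: the bits of the absolute value, least significant first, each multiplied by the sign of the element. The function name is made up for illustration, and this is only meant as a reference to check faster versions against.

```python
import numpy as np

def to_signed_bits_reference(values, ln):
    """Slow reference: LSB-first bits of |v|, each multiplied by sign(v)."""
    out = []
    for v in values:
        v = int(v)
        sign = (v > 0) - (v < 0)   # -1, 0 or 1
        mag = abs(v)
        # Bit k of the magnitude, for k = 0 (least significant) .. ln-1
        out.extend(sign * ((mag >> k) & 1) for k in range(ln))
    return np.array(out, dtype=np.int8)

print(to_signed_bits_reference([3, 5, 2], 6))
print(to_signed_bits_reference([-11, 5], 6))
```

The two calls reproduce the two example arrays given in the question.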

Upvotes: 3

Answers (3)

Paul Panzer

Reputation: 53089

Maybe I'm ignorant of your solution's full power, but it seems to have a few non-essential ingredients.

Here is a streamlined version. It checks for endianness and should be good for up to 64 bits on typical platforms.

import sys
import numpy as np

A = np.arange(-2, 3)*((2**40)-1)
ln = 60

np.unpackbits(np.abs(A[..., None]).view(np.uint8)[..., ::-1] if sys.byteorder=='little' else np.abs(A[..., None]).view(np.uint8), axis=-1)[..., :-ln-1:-1].view(np.int8) * np.sign(A[:, None]).astype(np.int8)

Output

array([[ 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]], dtype=int8)
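
For readability, the one-liner above can be unrolled into named steps. This is only a sketch of the same idea; the function name and the explicit int64 conversion are my additions, not part of the original expression.

```python
import sys
import numpy as np

def signed_bits_unpack(A, ln):
    """LSB-first signed bits via np.unpackbits; assumes 1-D input and ln <= 64."""
    A = np.asarray(A, dtype=np.int64)
    # View each magnitude as its 8 raw bytes: shape (n, 8).
    raw = np.abs(A[:, None]).view(np.uint8)
    # Put the most significant byte first so unpackbits yields MSB-first bits.
    if sys.byteorder == 'little':
        raw = raw[:, ::-1]
    bits = np.unpackbits(raw, axis=-1)      # shape (n, 64), MSB first
    lsb_first = bits[:, :-ln - 1:-1]        # last ln bits, in reversed order
    # Reinterpret the 0/1 bytes as int8 and apply the sign of each element.
    return lsb_first.view(np.int8) * np.sign(A)[:, None].astype(np.int8)

print(signed_bits_unpack([-11, 5], 6).ravel())
```

The result is 2D with one row per element; `.ravel()` flattens it into the layout asked for in the question.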

Upvotes: 1

Divakar

Reputation: 221684

Here's a vectorized one -

((A[:,None] & (1 << np.arange(ln)))!=0).ravel().view('i1')

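Another with np.unpackbits (it reads only the low byte of each element, so it assumes little-endian int64 input with values in 0..255; the slice bounds are tuned for ln = 6, and [:, :-ln-1:-1] generalizes them for ln <= 8) -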

np.unpackbits(A.view(np.uint8)[::8]).reshape(-1,8)[:,ln-7:1:-1].ravel()

Sample run -

In [197]: A
Out[197]: array([3, 5, 2])

In [198]: ln = 6

In [199]: ((A[:,None] & (1 << np.arange(ln)))!=0).ravel().view('i1')
Out[199]: array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=int8)

In [200]: np.unpackbits(A.view(np.uint8)[::8]).reshape(-1,8)[:,ln-7:1:-1].ravel()
Out[200]: array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=uint8)

Timings on a large array -

In [201]: A = np.random.randint(0,6,1000000)

In [202]: ln = 6

In [203]: %timeit ((A[:,None] & (1 << np.arange(ln)))!=0).ravel().view('i1')
10 loops, best of 3: 32.1 ms per loop

In [204]: %timeit np.unpackbits(A.view(np.uint8)[::8]).reshape(-1,8)[:,ln-7:1:-1].ravel()
100 loops, best of 3: 8.14 ms per loop

If you are okay with a 2D array output, with each row holding the binary info for one element of the input, it's much faster -

In [205]: %timeit np.unpackbits(A.view(np.uint8)[::8]).reshape(-1,8)[:,ln-7:1:-1]
1000 loops, best of 3: 1.04 ms per loop

Other posted approaches -

# @aburak's soln
In [206]: %timeit np.multiply( np.delete( np.unpackbits( np.abs(A.astype(int)).view("uint8")).reshape(-1,8)[:,::-1].reshape(-1,64), np.s_[ln::],1).astype("float64").ravel(), np.repeat(np.sign(A), ln))
10 loops, best of 3: 180 ms per loop

# @Jacques Gaudin's soln
In [208]: %timeit np.array([int(c) for i in A for c in np.binary_repr(i, width=6)[::-1]])
1 loop, best of 3: 3.34 s per loop

# @Paul Panzer's soln
In [209]: %timeit np.unpackbits(A[:, None].view(np.uint8)[..., ::-1] if sys.byteorder=='little' else A[:, None].view(np.uint8), axis=-1)[..., :-ln-1:-1].reshape(-1)
10 loops, best of 3: 35.4 ms per loop

The main thing working in favour of the second approach from this post is that it operates on a uint8 version of the input, which is simply a view into the input and hence memory efficient -

In [238]: A
Out[238]: array([3, 5, 2])

In [239]: A.view(np.uint8)[::8]
Out[239]: array([3, 5, 2], dtype=uint8)

In [240]: np.shares_memory(A,A.view(np.uint8)[::8])
Out[240]: True

So, when we use np.unpackbits, we feed in exactly one byte per element of the original array.

Also, A.view(np.uint8)[::8] is a handy trick to view an int64 array as a uint8 one, provided the machine is little-endian and every value fits in a single byte (0 to 255)!
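
A slightly more defensive sketch of that view trick, which strides by the element size instead of a hard-coded 8 and picks the low byte according to the machine's byte order. The helper name is hypothetical, and it assumes a 1-D, contiguous array in native byte order.

```python
import sys
import numpy as np

def low_byte_view(A):
    """Return a uint8 view of the least significant byte of each element (no copy)."""
    # On little-endian machines the low byte comes first in each element,
    # on big-endian machines it comes last.
    offset = 0 if sys.byteorder == 'little' else A.itemsize - 1
    return A.view(np.uint8)[offset::A.itemsize]

A = np.array([3, 5, 2])
v = low_byte_view(A)
print(v)                        # [3 5 2]
print(np.shares_memory(A, v))   # True
```

Anything above 255 is silently truncated to its low byte here, which is why this shortcut only suits small non-negative inputs.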

To handle the generic case (negative values and larger ln), we could extend the approaches listed earlier.

Approach #1 (for ln up to 63) :

(((np.abs(A)[:,None] & (1 << np.arange(ln)))!=0)*np.sign(A)[:,None]).ravel()

Approach #2 :

a = np.abs(A)
m = ((ln-1)//8)+1
b = a.view(np.uint8).reshape(-1,8)[:,:m]
U = np.unpackbits(b,axis=1)
out = U.reshape(-1,m,8)[...,::-1].reshape(len(A),-1)[...,:ln]
out = (out*np.sign(A)[:,None]).ravel()
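
As a quick sanity check, the two generic approaches can be wrapped in functions and compared on random signed input. This is a sketch: the function names and the test data are mine, and Approach #2 as written assumes little-endian int64 input.

```python
import numpy as np

def approach1(A, ln):
    # Bit masks 1, 2, 4, ... as int64 so that large ln does not overflow.
    mask = 1 << np.arange(ln, dtype=np.int64)
    return (((np.abs(A)[:, None] & mask) != 0) * np.sign(A)[:, None]).ravel()

def approach2(A, ln):
    # Assumes A is a little-endian int64 array.
    a = np.abs(A)
    m = ((ln - 1) // 8) + 1                       # bytes needed for ln bits
    b = a.view(np.uint8).reshape(-1, 8)[:, :m]    # low m bytes of each element
    U = np.unpackbits(b, axis=1)                  # MSB-first within each byte
    out = U.reshape(-1, m, 8)[..., ::-1].reshape(len(A), -1)[..., :ln]
    return (out * np.sign(A)[:, None]).ravel()

rng = np.random.default_rng(0)
A = rng.integers(-2**40, 2**40, size=1000)        # int64 by default
for ln in (6, 13, 60):
    assert np.array_equal(approach1(A, ln), approach2(A, ln))
print("both approaches agree")
```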

Upvotes: 2

Jacques Gaudin

Reputation: 16998

You can do so by using binary_repr:

arr = np.array([3,5,2])
res = np.array([int(c) for i in arr for c in np.binary_repr(i, width=6)[::-1]])

>>> print(res)
[1 1 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0]

The [::-1] is a trick to iterate through the string in reverse order: the step of the iteration is set to -1. For more details refer to the extended slices docs.

Or with a format string (it starts to look like code golf though):

res = np.array([int(c) for i in arr for c in f'{i:06b}'[::-1]])

f'{i:06b}' is a string representing i in binary with 6 digits and leading zeros.

Speed-wise, this is very slow... Sorry I didn't get that bit of the question!
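
For completeness, the format-string version can be bent to cover the negative-value requirement from the edit by formatting the absolute value and multiplying each digit by the sign. This is just a sketch with a variable width `ln` (my naming, taken from the question), and it is every bit as slow as the versions above.

```python
import numpy as np

arr = np.array([-11, 5])
ln = 6

res = np.array(
    [
        (1 if i >= 0 else -1) * int(c)
        for i in arr.tolist()                      # plain Python ints
        # Binary digits of |i|, zero-padded to ln, reversed to LSB first,
        # truncated to ln digits in case |i| needs more bits than that.
        for c in f'{abs(i):0{ln}b}'[::-1][:ln]
    ],
    dtype=np.int8,
)
print(res)
```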

Upvotes: 2
