epsilon_j

Reputation: 335

Extracting 2-bit integers from a string using Python

I am using Python to receive a string via UDP. From each character in the string I need to extract the four pairs of bits and convert them to integers.

For example, if the first character in the string was "J", this is ASCII 0x4a or 0b01001010. So I would extract the pairs of bits [01, 00, 10, 10], which would be converted to [1, 0, 2, 2].

Speed is my number one priority here, so I am looking for a fast way to accomplish this.

Any help is much appreciated, thank you.

Upvotes: 2

Views: 863

Answers (2)

Paul Panzer

Reputation: 53079

You can use np.unpackbits

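import numpy as np

def bitpairs(a):
    # a is a uint8 array; unpackbits yields 8 bits per byte, most significant first
    bf = np.unpackbits(a)
    # even positions hold the high bit of each pair, odd positions the low bit
    return bf[1::2] + (bf[::2]<<1)
    ### or: return bf[1::2] | (bf[::2]<<1) but doesn't seem faster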

### small example
bitpairs(np.frombuffer(b'J', 'u1'))
# array([1, 0, 2, 2], dtype=uint8)

### large example
from string import ascii_letters as L
from timeit import timeit
S = np.random.choice(np.array(list(L), 'S1'), 1000000).view('S1000000').item(0)
### one very long byte string
S[:10], S[999990:]
# (b'fhhgXJltDu', b'AQGTlpytHo')
timeit(lambda: bitpairs(np.frombuffer(S, 'u1')), number=1000)
# 8.226706639004988
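
To check what bitpairs computes without NumPy, the same pairs can be obtained with plain shifts and masks. This is only a verification sketch: the name bitpairs_reference is hypothetical (it is not part of the answer above), and a per-byte Python loop like this should be expected to run far slower than the vectorized version.

```python
def bitpairs_reference(data: bytes) -> list:
    """Return the four 2-bit fields of every byte, most significant pair first."""
    out = []
    for byte in data:  # iterating over bytes yields ints in Python 3
        for shift in (6, 4, 2, 0):
            # shift the wanted pair down to the low two bits, then mask it off
            out.append((byte >> shift) & 0b11)
    return out

print(bitpairs_reference(b'J'))  # [1, 0, 2, 2]
```

For b'J' (0b01001010) this gives the same [1, 0, 2, 2] as the small example, so it can serve as a reference when testing a faster implementation.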

Upvotes: 3

Brad Solomon

Reputation: 40908

You can slice the string and convert to int assuming base 2:

>>> byt = '11100100'
>>> [int(b, 2) for b in (byt[0:2], byt[2:4], byt[4:6], byt[6:8])]
[3, 2, 1, 0]

This assumes that byt is always an 8-character str, rather than the int formed through the binary literal 0b11100100.

A more general solution might look something like:

>>> def get_int_slices(b: str) -> list:
...     return [int(b[i:i+2], 2) for i in range(0, len(b), 2)]
... 
>>> get_int_slices('1110010011100100111001001110010011100100')
[3, 2, 1, 0, 3, 2, 1, 0, 3, 2, 1, 0, 3, 2, 1, 0, 3, 2, 1, 0]

The int(x, 2) call says, "interpret the input as being in base 2."
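
The snippets above start from a str of '0' and '1' characters, while data received over UDP normally arrives as bytes. A minimal sketch of bridging the two, assuming Python 3 and that each byte should be zero-padded to 8 bits; the helper name bytes_to_bitstring is hypothetical:

```python
def bytes_to_bitstring(data: bytes) -> str:
    # format(byte, '08b') renders each byte as exactly 8 binary digits
    return ''.join(format(byte, '08b') for byte in data)

def get_int_slices(b: str) -> list:
    # same function as above: read the bit string two characters at a time
    return [int(b[i:i+2], 2) for i in range(0, len(b), 2)]

bits = bytes_to_bitstring(b'J')
print(bits)                  # 01001010
print(get_int_slices(bits))  # [1, 0, 2, 2]
```

Building the intermediate string costs time, so this route is likely better suited to clarity than to the speed requirement in the question.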


*To my knowledge, none of my answers have ever won a speed race against Paul Panzer's, and this one is probably no exception.

Upvotes: 3
