Reputation:
I want to unpack a string and its length from a buffer.
For example, I'd like to obtain (4, 'Gégé')
from this buffer:
b'\x00\x04G\xE9g\xe9'
Does anyone know how to do this?
Upvotes: 0
Views: 343
Reputation: 55469
The length data looks like a big-endian unsigned 16-bit integer, and the string data looks like it's encoded as Latin-1. If that's correct, you can extract it like this:
from struct import unpack

def extract(buff):
    # First 2 bytes: big-endian unsigned 16-bit length; the rest: Latin-1 text
    return unpack(b'>H', buff[:2])[0], buff[2:].decode('latin1')

buff = b'\x00\x04G\xE9g\xe9'
print(extract(buff))
output
(4, 'Gégé')
Another possibility for the encoding is the old Windows code page 1252, which can be decoded using .decode('cp1252').
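To see where the two candidate encodings actually differ, here is a small sketch. The sample buffer decodes identically under both, so it can't tell them apart; the byte 0x80 used below is my own illustrative choice and does not come from the question. Bytes in the range 0x80 to 0x9F are the ones to look at in real data.

```python
# The payload from the question decodes the same way under both encodings.
sample = b'G\xe9g\xe9'
assert sample.decode('latin1') == sample.decode('cp1252') == 'Gégé'

# Hypothetical extra byte (not in the original buffer) where they disagree:
data = b'\x80'
latin1_text = data.decode('latin1')  # C1 control character U+0080
cp1252_text = data.decode('cp1252')  # the euro sign, U+20AC

print(repr(latin1_text), repr(cp1252_text))
```

If your data contains characters such as the euro sign or curly quotes, cp1252 is the more likely candidate.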
The above code works in both Python 2 and Python 3. But in Python 3 there's an easier way: we don't need struct.unpack, we can use the int.from_bytes method.
def extract(buff):
    # Python 3 only: int.from_bytes replaces struct.unpack for the length prefix
    return int.from_bytes(buff[:2], 'big'), buff[2:].decode('latin1')
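Both versions above decode everything after the first two bytes and ignore the length value. If the buffer might hold more data after the string, a variant that honours the length prefix could look like the sketch below. This is Python 3 only, and the function name extract_prefixed, the offset parameter, and the trailing record in the usage example are my own assumptions, not something given in the question.

```python
def extract_prefixed(buff, offset=0, encoding='latin1'):
    """Read one length-prefixed string starting at `offset`.

    Returns (length, text, next_offset). Assumes a 2-byte big-endian
    length that counts bytes, followed by that many encoded bytes.
    """
    start = offset + 2
    if start > len(buff):
        raise ValueError('buffer too short for the length prefix')
    length = int.from_bytes(buff[offset:start], 'big')
    end = start + length
    if end > len(buff):
        raise ValueError('buffer shorter than the declared length')
    return length, buff[start:end].decode(encoding), end


# Hypothetical buffer: the original record followed by a second one.
buff = b'\x00\x04G\xe9g\xe9\x00\x02ok'
length, text, pos = extract_prefixed(buff)
print((length, text))                    # first record
print(extract_prefixed(buff, pos)[:2])   # second record
```

Returning the next offset makes it easy to walk through several records in a row.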
Upvotes: 4