Reading bytes in Python - io.BytesIO vs binascii.unhexlify

Question

What is the difference between the two ways of reading bytes shown below?

from binascii import unhexlify
from io import BytesIO

stream = BytesIO(unhexlify('000000010000'))
print(int.from_bytes(stream.read(4), byteorder="big"))  # prints 1


data = unhexlify('000000010000')
print(int.from_bytes(data[:4], byteorder="big"))  # prints 1

Which is better, and why?

Moinuddin Quadri · Accepted Answer

If you know that your string is a hex string, why not slice it and convert it directly to an int with base 16? For example:

>>> my_hex = '000000010000'
>>> int(my_hex[:8], base=16)
1

Note that I slice the string at index 8 instead of 4. Since we know it is a hex string, and each byte is written as 2 hex characters, the first 4 bytes correspond to the first 8 characters.
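As a quick check of that slicing arithmetic, here is a minimal sketch using the same hex string as the question (the variable names from_text and from_raw are only illustrative). It shows that the first 8 hex characters and the first 4 decoded bytes give the same integer:

```python
from binascii import unhexlify

my_hex = '000000010000'

# Each byte is written as two hex characters, so 4 bytes == 8 characters.
from_text = int(my_hex[:8], base=16)
from_raw = int.from_bytes(unhexlify(my_hex)[:4], byteorder='big')

print(from_text, from_raw)  # both values are 1
```

Keep in mind that int(..., base=16) reads the digits most-significant first, so it only matches int.from_bytes with byteorder='big'. For little-endian data you would still need int.from_bytes.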

Here's a performance comparison of your solutions and mine:

mquadri$ python3 -m timeit "my_hex = '000000010000'; int(my_hex[:8], base=16)"
1000000 loops, best of 3: 0.581 usec per loop

mquadri$ python3 -m timeit -s "from io import BytesIO; from binascii import unhexlify" "stream = BytesIO(unhexlify('000000010000')); int.from_bytes(stream.read(4), byteorder='big')"
1000000 loops, best of 3: 1.15 usec per loop

mquadri$ python3 -m timeit -s "from binascii import unhexlify" "bytes = unhexlify('000000010000'); int.from_bytes(bytes[:4], byteorder='big')"
1000000 loops, best of 3: 0.764 usec per loop

As you can see, converting directly with int is faster than both of your solutions.
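If you would rather reproduce the comparison from a script than from the command line, something like the following sketch should work. The function names and the loop count are my own choices for illustration, and because each variant is wrapped in a function, the absolute numbers will include call overhead and will not match the figures above exactly:

```python
import timeit
from binascii import unhexlify
from io import BytesIO

HEX = '000000010000'
LOOPS = 100_000  # arbitrary loop count chosen for this sketch


def via_int():
    # Slice the hex text and parse it as a base-16 integer.
    return int(HEX[:8], base=16)


def via_bytesio():
    # Decode to bytes, wrap in a stream, and read the first 4 bytes.
    return int.from_bytes(BytesIO(unhexlify(HEX)).read(4), byteorder='big')


def via_slice():
    # Decode to bytes and slice the first 4 bytes directly.
    return int.from_bytes(unhexlify(HEX)[:4], byteorder='big')


for fn in (via_int, via_bytesio, via_slice):
    seconds = timeit.timeit(fn, number=LOOPS)
    print(f"{fn.__name__}: {seconds / LOOPS * 1e6:.3f} usec per call")
```

All three functions return the same value, so only the timings differ. Expect the exact numbers to vary by machine and Python version.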

However, if you are only interested in the two solutions you mentioned, I'd suggest the one without io.BytesIO, because:

it requires one fewer import, since io.BytesIO is not needed
it is also the simpler of the two to read
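As for what actually differs between the two: a BytesIO object keeps track of a read position, while a plain bytes object has to be sliced with explicit offsets. The sketch below is only an illustration of that behaviour on the question's data, not something covered by the timings above:

```python
from binascii import unhexlify
from io import BytesIO

data = unhexlify('000000010000')

# BytesIO keeps a read position, so consecutive reads advance through the data.
stream = BytesIO(data)
first = int.from_bytes(stream.read(4), byteorder='big')
second = int.from_bytes(stream.read(2), byteorder='big')

# With plain slicing, the offsets have to be tracked by hand.
first_sliced = int.from_bytes(data[0:4], byteorder='big')
second_sliced = int.from_bytes(data[4:6], byteorder='big')

print(first, second, stream.tell())
print(first_sliced, second_sliced)
```

So for a single fixed-size read, slicing is the lighter option. The stream may be more convenient if you need to read many fields one after another.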

Note: The timings above do not include import time (the imports are in timeit's -s setup statement), in case someone is planning to say that "this difference is related to the additional import" ;)
