lch

Reputation: 4931

Reading bytes in Python - io.BytesIO vs. binascii.unhexlify

What is the difference between the following two ways of reading bytes?

from binascii import unhexlify
from io import BytesIO

stream = BytesIO(unhexlify('000000010000'))
print(int.from_bytes(stream.read(4), byteorder="big"))  # prints 1

bytes = unhexlify('000000010000')
print(int.from_bytes(bytes[:4], byteorder="big"))  # prints 1

Which is better, and why?

Upvotes: 0

Views: 1125

Answers (2)

Moinuddin Quadri

Reputation: 48090

If you know that your string is a hex string, why not slice it and convert it directly to an int with base 16? For example:

>>> my_hex = '000000010000'
>>> int(my_hex[:8], base=16)
1

Note that I slice the string at index 8 rather than 4. In a hex string, two characters represent one byte, so the first 4 bytes correspond to the first 8 characters.
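As a quick check of the reasoning above, here is a minimal sketch comparing the string-based conversion with both approaches from the question. The variable names (raw, via_int, via_slice, via_stream, little) are only illustrative. It also shows one caveat: int(..., base=16) reads the digits most-significant first, so it only matches the big-endian interpretation.

```python
from binascii import unhexlify
from io import BytesIO

my_hex = '000000010000'
raw = unhexlify(my_hex)  # 6 bytes: 00 00 00 01 00 00

# Three ways to read the first 4 bytes as a big-endian unsigned integer.
via_int = int(my_hex[:8], base=16)                     # first 8 hex characters
via_slice = int.from_bytes(raw[:4], byteorder="big")   # first 4 bytes
via_stream = int.from_bytes(BytesIO(raw).read(4), byteorder="big")

print(via_int, via_slice, via_stream)  # 1 1 1

# int(..., base=16) has no byte-order option, so a little-endian
# field still needs int.from_bytes.
little = int.from_bytes(raw[:4], byteorder="little")
print(little)  # 16777216
```

So the shortcut applies when the input is already a hex string and the value is big-endian. For other byte orders, int.from_bytes on the decoded bytes is the more direct route.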

Here is a performance comparison of your two solutions and mine:

mquadri$ python3 -m timeit "my_hex = '000000010000'; int(my_hex[:8], base=16)"
1000000 loops, best of 3: 0.581 usec per loop

mquadri$ python3 -m timeit -s "from io import BytesIO; from binascii import unhexlify" "stream = BytesIO(unhexlify('000000010000')); int.from_bytes(stream.read(4), byteorder='big')"
1000000 loops, best of 3: 1.15 usec per loop

mquadri$ python3 -m timeit -s "from binascii import unhexlify" "bytes = unhexlify('000000010000'); int.from_bytes(bytes[:4], byteorder='big')"
1000000 loops, best of 3: 0.764 usec per loop

As you can see, in these measurements simply using int for the conversion is faster than both of your solutions.


However, if you are only interested in the solutions you mentioned, then I'd suggest the one without io.BytesIO, because:

  • without BytesIO, you need one fewer import
  • your second solution is also simpler to read

Note: The timings above do not include import time (the imports are in the -s setup statement), in case someone is planning to say that "this difference is related to the additional import" ;)

Upvotes: 4

Michael Ekoka

Reputation: 20098

The point of using IO constructs (StringIO, BytesIO) is to work with objects that mimic a stream (like files). So your first solution wraps your bytes in a file-like object and reads from that wrapper as if it were a file. Your second solution just reads from the bytes directly.

I'd say that if the semantics of your code do not require the bytes to be a stream, skip the IO solution and go straight to the source.
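To illustrate what "stream semantics" means here, the sketch below reads two consecutive fields from the same data. The layout (a 4-byte field followed by a 2-byte field) is an assumption made up for the example, not something stated in the question. The BytesIO object keeps track of the read position itself, while with plain bytes the caller has to keep track of the offsets.

```python
from binascii import unhexlify
from io import BytesIO

# Assumed layout for illustration: a 4-byte field followed by a 2-byte field.
raw = unhexlify('000000010000')

# Stream style: each read() continues where the previous one stopped.
stream = BytesIO(raw)
first = int.from_bytes(stream.read(4), byteorder="big")
second = int.from_bytes(stream.read(2), byteorder="big")
position = stream.tell()  # current offset after both reads

# Slice style: the caller supplies the offsets explicitly.
first_sliced = int.from_bytes(raw[0:4], byteorder="big")
second_sliced = int.from_bytes(raw[4:6], byteorder="big")

print(first, second, position)      # 1 0 6
print(first_sliced, second_sliced)  # 1 0
```

If the code only ever needs one fixed slice, as in the question, the stream bookkeeping buys nothing. It may become more useful when many fields are read in sequence or when the same code should also accept a real file object.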

Upvotes: 1
