I'm doing some serial protocol work and want to implement a basic byte stuffing algorithm in Python. I'm struggling to determine the most Pythonic way to do it.
Byte stuffing is basically just replacing any "reserved" bytes with a pair composed of an escape byte and the original byte transformed in a reversible way (e.g. xor'ed).
So far, I've come up with 5 different approaches, and each of them has something I don't like about it:
def stuff1(bits):
    for byte in bits:
        if byte in _EscapeCodes:
            yield PacketCode.Escape
            yield byte ^ 0xFF
        else:
            yield byte
This may be my favorite, but maybe just because I'm kind of fascinated by yield based generators. I worried that the generator would make it slow, but it's actually the second fastest of the bunch.
def stuff2(bits):
    result = bytes()
    for byte in bits:
        if byte in _EscapeCodes:
            result += bytes([PacketCode.Escape, byte ^ 0xFF])
        else:
            result += bytes([byte])
    return result
This constantly has to create single-element arrays just to throw them away, because I'm not aware of any "copy with one additional element" operation. It ties for the slowest of the bunch.
def stuff3(bits):
    result = bytearray()
    for byte in bits:
        if byte in _EscapeCodes:
            result.append(PacketCode.Escape)
            result.append(byte ^ 0xFF)
        else:
            result.append(byte)
    return result
This seems better than the direct bytes() approach, since it can append one byte at a time (instead of needing intermediate 1-element collections). It is actually slower than the yield generator, though, and it feels brutish. Its performance is middle of the pack.
from io import BytesIO

def stuff4(bits):
    bio = BytesIO()
    for byte in bits:
        if byte in _EscapeCodes:
            bio.write(bytes([PacketCode.Escape, byte ^ 0xFF]))
        else:
            bio.write(bytes([byte]))
    return bio.getbuffer()
I like the stream-based approach here. But it is annoying that there doesn't seem to be something like a write1 API that could just add 1 byte, so I have to make those intermediate bytes objects again. If there were a "write single byte" call, I'd like this one. It ties for slowest.
def stuff5(bits):
    escapeStuffed = bytes(bits).replace(bytes([PacketCode.Escape]), bytes([PacketCode.Escape, PacketCode.Escape ^ 0xFF]))
    stopStuffed = escapeStuffed.replace(bytes([PacketCode.Stop]), bytes([PacketCode.Escape, PacketCode.Stop ^ 0xFF]))
    return stopStuffed.replace(bytes([PacketCode.Start]), bytes([PacketCode.Escape, PacketCode.Start ^ 0xFF]))
This is the fastest. But I don't like the way the code reads and the intermediate sweeps.
I also tried to use translate(), but AFAICT, it can only translate 1:1 sequences.
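To illustrate the 1:1 limitation: the table passed to `bytes.translate` maps each input byte to exactly one output byte, so it cannot expand a reserved byte into an escape pair. One possible workaround, sketched here and not timed against the approaches above, is a 256-entry lookup table of `bytes` objects joined in a single pass. The names `START`, `STOP`, `ESCAPE`, `RESERVED`, `STUFF_TABLE` and `stuff_table` are hypothetical, and the reserved values are assumptions.

```python
# Assumed reserved values, for illustration only.
START, STOP, ESCAPE = 0x02, 0x03, 0x10
RESERVED = frozenset((START, STOP, ESCAPE))

# One entry per possible byte value: either the byte itself
# or its two-byte escaped form (escape byte + XOR'ed original).
STUFF_TABLE = [
    bytes([ESCAPE, value ^ 0xFF]) if value in RESERVED else bytes([value])
    for value in range(256)
]


def stuff_table(data):
    """Stuff `data` by joining the precomputed replacement for each byte."""
    return b"".join(STUFF_TABLE[byte] for byte in data)
```

The table is built once at import time, so the per-byte work is a list lookup and no intermediate `bytes` objects are created inside the loop.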