Reputation: 1618
Specifically, I am receiving a stream of bytes from a TCP socket that looks something like this:
inc_tcp_data = b'\x02hello\x1cthisisthedata'
The stream uses hex values to denote different parts of the incoming data. However, I want to use inc_tcp_data in the following format:
converted_data = '\x02hello\x1cthisisthedata'
Essentially, I want to get rid of the b and just literally spit out what came in.
I've tried various struct.unpack methods as well as .decode("encoding"). I could not get the former to work at all, and the latter either stripped out the hex values when there was no visual way to represent them or converted them to characters when it could. Any ideas?
Update:
I was able to get my desired result with the following code:
inc_tcp_data = b'\x02hello\x3Fthisisthedata'.decode("ascii")
d = repr(inc_tcp_data)
print(d)
print(len(d))
print(len(inc_tcp_data))
The output is:
'\x02hello?thisisthedata'
25
20
However, this still doesn't help me, because I need the regular expression that follows to see \x02 as a single hex value and not as a 4-character string.
What am I doing wrong?
Update 2:
I've solved this issue by not solving it. The reason I wanted the hex characters to remain unchanged was so that a regular expression would be able to detect them further down the road. However, what I should have done (and did) was simply change the regular expression to analyze the bytes without decoding them. Once I had separated out all the parts via the regular expression, I decoded the parts with .decode("ascii"), and everything worked out great.
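For anyone who wants to see the shape of that approach, here is a minimal sketch. The pattern and the group names (header, payload) are hypothetical, since the actual regular expression is not shown in this post; it only assumes the sample frame above, where \x02 starts the message and \x1c separates the two parts.

```python
import re

# Sample frame from the question: 0x02, a header, 0x1c, then the payload.
inc_tcp_data = b'\x02hello\x1cthisisthedata'

# Hypothetical pattern. The rb prefix makes it a bytes pattern, so it can be
# applied to the raw socket data without decoding first.
FRAME_RE = re.compile(rb'\x02(?P<header>[^\x1c]*)\x1c(?P<payload>.*)', re.DOTALL)

match = FRAME_RE.match(inc_tcp_data)
if match is None:
    raise ValueError("data does not match the assumed frame layout")

# Decode only the separated parts, after the delimiters have done their job.
header = match.group('header').decode('ascii')
payload = match.group('payload').decode('ascii')

print(header)
print(payload)
```

A bytes pattern can only be matched against bytes, and a str pattern only against str, so the pattern and the data have to be the same type.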
I'm just updating this if it happens to help someone else.
Upvotes: 0
Views: 2943
Reputation: 40624
Assuming you are using Python 3:
>>> inc_tcp_data.decode('ascii')
'\x02hello\x1cthisisthedata'
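As a rough check (a sketch using the sample bytes from the question, not the real socket data), the decoded string still holds \x02 and \x1c as single characters. The 4-character form only appears once repr() is applied, which seems to be where the regular expression in the question ran into trouble.

```python
import re

inc_tcp_data = b'\x02hello\x1cthisisthedata'
text = inc_tcp_data.decode('ascii')

# ASCII decoding maps each byte to exactly one character.
print(len(inc_tcp_data), len(text))

# The control characters survive decoding as single characters.
print(text[0] == '\x02', text[6] == '\x1c')

# repr() is what expands them into visible backslash escapes.
shown = repr(text)
print(shown)
print(len(shown))

# A str pattern finds the control character in the decoded text,
# but not in the repr() output, where it has become four characters.
print(re.search('\x02', text) is not None)
print(re.search('\x02', shown) is not None)
```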
Upvotes: 1