Reputation: 103
I want to get a string containing the original bytes (assembled code) without converting them to another encoding. Since the bytes are shellcode, I do not need to encode them and want to write them out directly as raw bytes. Put simply, I want to convert "b'\xb7\x00\x00\x00'" to "\xb7\x00\x00\x00", i.e. get the string representation of the raw bytes. For example:
>> byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00'
>> uc_str = str(byte_code)[2:-1]
>> print(byte_code, uc_str)
b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00' \xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00
Currently I only have two ugly methods:
>> uc_str = str(byte_code)[2:-1]
>> uc_str = "".join('\\x{:02x}'.format(c) for c in byte_code)
Raw bytes usage:
>> my_template = "const char byte_code[] = 'TPL'"
>> uc_str = str(byte_code)[2:-1]
>> my_code = my_template.replace("TPL", uc_str)
# then write my_code to xx.h
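Spelled out end to end, the usage above might look like the sketch below. It relies on the second method for the escaping. The helper names `to_hex_escapes` and `render_header` are made up for illustration, and the double-quoted C string literal with `unsigned char` is an assumption about what the header should contain.

```python
def to_hex_escapes(data: bytes) -> str:
    """Return every byte as a four-character hex escape (second method above)."""
    return "".join("\\x{:02x}".format(b) for b in data)


def render_header(data: bytes, name: str = "byte_code") -> str:
    """Build one line of C source declaring the bytes as a string literal."""
    # Assumption: a double-quoted C string literal is what the header needs.
    return 'const unsigned char {}[] = "{}";\n'.format(name, to_hex_escapes(data))


byte_code = b"\xb7\x00\x00\x00\x05\x00\x00\x00"
my_code = render_header(byte_code)
# my_code can then be written to xx.h, e.g. inside `with open("xx.h", "w") as f:`.
```

Note that a C string literal appends a terminating NUL, so the resulting array is one byte longer than the Python bytes object.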
Is there a more Pythonic way to do this?
Upvotes: 1
Views: 1299
Reputation: 41
I came across this trying to do something similar with some SNMP code.
byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00'
text = byte_code.decode('raw_unicode_escape')
writer_func(text)
It worked to send an SNMP Hex string as an OctetString when there was no helper support for hex.
See also standard-encodings and bytes decode, and, for anyone looking at SNMP, the SNMP Set Types.
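To make clear what the decode call above produces, here is a small sketch. For bytes like these, it yields a str with one character per byte, each character having the same code point as the byte value, rather than text containing backslash escapes. The variable names are only illustrative.

```python
byte_code = b"\xb7\x00\x00\x00\x05\x00\x00\x00\x95"

# One character per byte; no backslashes are added to the text.
text = byte_code.decode("raw_unicode_escape")
code_points = [ord(ch) for ch in text]

# For code points below 256 the conversion round-trips.
round_trip = text.encode("raw_unicode_escape")

# Caveat: a literal backslash-u sequence in the input is interpreted.
interpreted = b"\\u0041".decode("raw_unicode_escape")
```

If that last behaviour is unwanted, `latin-1` maps bytes to code points one-to-one without interpreting any escapes.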
Upvotes: 0
Reputation: 362707
Your first method is broken, because any byte that can be represented as printable ASCII is shown as that character instead of being escaped. For example:
>>> str(b'\x00\x20\x41\x42\x43\x20\x00')[2:-1]
'\\x00 ABC \\x00'
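A side-by-side sketch of the two methods from the question on that same input shows the difference. The function names here are hypothetical and exist only for the comparison.

```python
def escape_via_repr(data: bytes) -> str:
    """First method: slice the b'...' wrapper off the repr."""
    return str(data)[2:-1]


def escape_every_byte(data: bytes) -> str:
    """Second method: format each byte as a hex escape."""
    return "".join("\\x{:02x}".format(b) for b in data)


sample = b"\x00\x20\x41\x42\x43\x20\x00"
via_repr = escape_via_repr(sample)      # printable bytes stay as characters
via_format = escape_every_byte(sample)  # every byte becomes four characters
```

Only the second result has a fixed four characters per input byte, which is what a generated header needs.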
The second method is actually okay. Since this feature appears to be missing from the stdlib, I've published all-escapes, which provides it.
pip install all-escapes
Example usage:
>>> b"\xb7\x00\x00\x00".decode("all-escapes")
'\\xb7\\x00\\x00\\x00'
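For anyone who would rather not add a dependency, a similar effect can be sketched with the stdlib `codecs` module. This makes no claim about how all-escapes is implemented. The codec name `demo_hex_escapes` and the helper functions are hypothetical, and the encoder assumes its input consists only of `\xNN` escapes.

```python
import codecs


def _decode(data, errors="strict"):
    """Turn every input byte into a four-character hex escape."""
    raw = bytes(data)
    text = "".join("\\x{:02x}".format(b) for b in raw)
    return text, len(raw)


def _encode(text, errors="strict"):
    """Inverse direction; assumes text is made purely of hex escapes."""
    raw = bytes.fromhex(text.replace("\\x", ""))
    return raw, len(text)


def _search(name):
    # Accept both spellings, since name normalization differs across versions.
    if name.replace("-", "_") == "demo_hex_escapes":
        return codecs.CodecInfo(encode=_encode, decode=_decode,
                                name="demo_hex_escapes")
    return None


codecs.register(_search)

escaped = b"\xb7\x00\x00\x00".decode("demo_hex_escapes")
restored = escaped.encode("demo_hex_escapes")
```

Registering a codec keeps the call site as short as a plain `decode`, at the cost of process-wide state. A plain function is simpler if the conversion is needed in only one place.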
Upvotes: 2
Reputation: 700
The basic conversion from bytes to str is:
>>> b"abc".decode()
'abc'
>>>
or:
>>> sb = b"abc"
>>> s = sb.decode()
>>> s
'abc'
>>>
The inverse is:
>>> "abc".encode()
b'abc'
>>>
or:
>>> s="abc"
>>> sb=s.encode()
>>> sb
b'abc'
>>>
In your case, you should use the errors argument:
>>> b"\xb7".decode(errors="replace")
'�'
>>>
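As a rough sketch of how the errors argument behaves, the handlers below are applied to the same invalid UTF-8 byte. Only bytes that fail to decode are affected; valid ones such as `\x41` or `\x00` still come out as ordinary characters.

```python
raw = b"\xb7"

# "replace" substitutes U+FFFD, so the original byte value is lost.
replaced = raw.decode("utf-8", errors="replace")

# "ignore" drops the undecodable byte entirely.
ignored = raw.decode("utf-8", errors="ignore")

# "backslashreplace" keeps the value as a textual escape.
escaped = raw.decode("utf-8", errors="backslashreplace")

# Valid bytes are decoded normally, so the output is not uniformly escaped.
mixed = b"\x41\xb7\x00".decode("utf-8", errors="backslashreplace")
```

Because valid bytes pass through unescaped, none of these handlers alone produces the uniform `\xNN` form asked for in the question.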
Upvotes: -1