Reputation: 103

Python 3: How to get a raw bytes string without encoding?

I want to get a string of the original bytes (assembly code) without converting them to another encoding. Since the bytes are shellcode, I don't need to decode them; I just want to write them out exactly as they are. Put simply, I want to convert b'\xb7\x00\x00\x00' to "\xb7\x00\x00\x00", i.e. get the string representation of the raw bytes. For example:

>>> byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00'
>>> uc_str = str(byte_code)[2:-1]
>>> print(byte_code, uc_str)
b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00' \xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00

So far I have only found two ugly ways to do it:

>>> uc_str = str(byte_code)[2:-1]
>>> uc_str = "".join('\\x{:02x}'.format(c) for c in byte_code)

This is how I use the resulting string:

>>> my_template = "const char byte_code[] = 'TPL'"
>>> uc_str = str(byte_code)[2:-1]
>>> my_code = my_template.replace("TPL", uc_str)
>>> # then write my_code to xx.h

Is there a more Pythonic way to do this?

Upvotes: 1

Answers (3)

Ben Fischer

Reputation: 41

I came across this trying to do something similar with some SNMP code.

byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00'
text = byte_code.decode('raw_unicode_escape')
writer_func(text)

This let me send an SNMP hex string as an OctetString when there was no helper support for hex.
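For anyone wondering what that decode call actually returns, here is a minimal sketch, assuming standard CPython 3 behaviour. As far as I can tell it gives one character per byte (much like latin-1) rather than the backslash-escaped text from the question, and it interprets any literal backslash-u sequence it finds in the data. The variable names are only for illustration.

```python
byte_code = b'\xb7\x00\x00\x00'

# raw_unicode_escape maps each byte to the code point with the same value
# (like latin-1), so the result has one character per byte.
text = byte_code.decode('raw_unicode_escape')
code_points = [ord(ch) for ch in text]

# Encoding with latin-1 recovers the original bytes.
round_trip = text.encode('latin-1')

# Caveat: a literal backslash-u sequence inside the bytes is interpreted.
interpreted = b'\\u0041'.decode('raw_unicode_escape')
```

So this is handy when an API wants a str that carries the byte values, but it is not a way to produce "\xb7\x00..." as visible text.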

See also the Python documentation on standard encodings and bytes.decode and, for anyone looking at SNMP, the SNMP Set Types.

Upvotes: 0

wim

Reputation: 362707

Your first method is broken: any byte that can be represented as printable ASCII is shown as that character instead of as an escape. For example:

>>> str(b'\x00\x20\x41\x42\x43\x20\x00')[2:-1]
'\\x00 ABC \\x00'

The second method is actually okay. Since this feature appears to be missing from the stdlib, I've published all-escapes, which provides it.

pip install all-escapes

Example usage:

>>> b"\xb7\x00\x00\x00".decode("all-escapes")
'\\xb7\\x00\\x00\\x00'
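If you would rather not add a dependency, the second method from the question can be wrapped in a small helper. This is only a sketch: escape_all and the template string are names made up here, not part of any library, and the template assumes you want a double-quoted C string literal.

```python
def escape_all(data: bytes) -> str:
    """Return every byte as a \\xNN escape, printable or not."""
    return "".join(f"\\x{b:02x}" for b in data)


byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00'
escaped = escape_all(byte_code)

# Hypothetical template; note the double quotes a C string literal needs.
template = 'const char byte_code[] = "TPL";\n'
header = template.replace("TPL", escaped)
```

Because every byte becomes its own two-digit escape, printable bytes such as 0x41 stay as \x41 instead of turning into "A", which is exactly where str(byte_code)[2:-1] goes wrong.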

Upvotes: 2

Emmanuel DUMAS

Reputation: 700

The basic conversion from bytes to str is:

>>> b"abc".decode()
'abc'
>>>

or:

>>> sb = b"abc"
>>> s = sb.decode()
>>> s
'abc'
>>>

The inverse is:

>>> "abc".encode()
b'abc'
>>>

or:

>>> s="abc"
>>> sb=s.encode()
>>> sb
b'abc'
>>>

In your case, you should use the errors argument:

>>> b"\xb7".decode(errors="replace")
'�'
>>>
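One caveat worth spelling out, as a rough sketch assuming Python 3.5 or later: errors="replace" throws away the original byte value, so it cannot be used to reproduce the shellcode. The "backslashreplace" handler keeps the value, but it only escapes bytes that fail to decode, so it still does not give the fully escaped string from the question.

```python
original = b"\xb7\x00A"

# errors="replace" substitutes U+FFFD for each undecodable byte,
# so the original value 0xb7 is lost.
replaced = original.decode("utf-8", errors="replace")

# errors="backslashreplace" keeps the value, but only escapes the bytes
# that fail to decode; valid bytes such as \x00 and "A" pass through as-is.
escaped = original.decode("utf-8", errors="backslashreplace")
```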

Upvotes: -1
