Reputation: 49
This is not about decoding using UTF-8. This is about reading a bytes object as a literal and needing it as a bytes object without reinventing the parsing process. If there is an answer to my question out there, it is hiding behind a lot of answers to questions about decoding.
Here is what I need:
x = "bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')"
y = ???(x, ???)
z = bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')
if y == z:
    print("Yes!")
Any suggestions for how to replace those question marks?
Thanks!
-- Dave
Upvotes: 2
Views: 1550
Reputation: 5866
One approach would be to strip the clutter from x (the leading bytearray(b' and the trailing ')), then convert each remaining character to its byte value and wrap the result in a bytearray object.
x = "bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')"
y = bytearray(ord(c) for c in x[12:-2])
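The slicing approach above can be sketched a bit more defensively. This is only an illustration: the helper name strip_and_convert and the PREFIX/SUFFIX constants are made up here, and the sketch assumes the escapes in x were already turned into single characters (a non-raw string literal, as in the question).

```python
# Hypothetical helper, not part of any library: strips the wrapper text
# and maps each remaining character to the byte with the same code point.
PREFIX = "bytearray(b'"
SUFFIX = "')"

def strip_and_convert(text):
    if not (text.startswith(PREFIX) and text.endswith(SUFFIX)):
        raise ValueError("text is not in the expected bytearray(b'...') form")
    inner = text[len(PREFIX):-len(SUFFIX)]
    # bytearray() raises ValueError if any code point is above 255.
    return bytearray(ord(c) for c in inner)

x = "bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')"
y = strip_and_convert(x)
z = bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')
print(y == z)
```

Checking the prefix and suffix instead of hard-coding 12 and -2 makes a malformed input fail loudly rather than silently producing the wrong bytes. If x holds the backslashes literally (a raw string, or text read from a file), this approach would return the backslash characters themselves, not the bytes they stand for.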
The second approach below isn't limited to bytearray. Use it with care to protect against injection, but if you are sure that your content is in the correct format you can use this:
x = r"bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')"
y = eval(x)
z = bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')
Here you need to write x as a raw string, r"...", so that the backslash escapes stay in x as literal text instead of being interpreted when x is defined. Whether this works with dynamic content, e.g. strings coming from standard input or read from files, therefore depends on whether that content holds the escape sequences literally, as backslash characters.
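To see why the r prefix matters, the following sketch compares the raw and non-raw versions of the same literal. The variable names x_raw, x_plain and plain_error are only for illustration; eval is used here on fixed, trusted strings only.

```python
x_raw = r"bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')"
x_plain = "bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')"

# The raw string keeps each escape as four characters (\, x, d, 8);
# the plain string stores a single character per escape.
print(len(x_raw), len(x_plain))

# The raw version is valid Python source for a bytearray.
y = eval(x_raw)

# The plain version puts non-ASCII characters inside a bytes literal,
# which Python rejects when it parses the expression.
plain_error = None
try:
    eval(x_plain)
except SyntaxError as exc:
    plain_error = exc
print(type(plain_error).__name__)
```

Text read from a file or typed on standard input normally arrives with the backslashes as ordinary characters, so it behaves like x_raw here.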
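You can also use ast.literal_eval(x[10:-1]), as suggested by kindall. It only evaluates literals, which avoids the injection risk of eval, and it returns a bytes object that you can pass to bytearray() if you need one.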
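A minimal sketch of the ast.literal_eval route, assuming x holds the escapes literally (as a raw string or as the output of repr). The function name parse_bytearray_repr is hypothetical.

```python
import ast

def parse_bytearray_repr(text):
    """Hypothetical helper: turn "bytearray(b'...')" text into a bytearray."""
    prefix, suffix = "bytearray(", ")"
    if not (text.startswith(prefix) and text.endswith(suffix)):
        raise ValueError("text is not in the expected bytearray(...) form")
    # literal_eval only accepts Python literals, so arbitrary code is rejected.
    value = ast.literal_eval(text[len(prefix):-len(suffix)])
    if not isinstance(value, bytes):
        raise ValueError("inner expression is not a bytes literal")
    return bytearray(value)

x = r"bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')"
y = parse_bytearray_repr(x)
z = bytearray(b'abc\xd8\xa2\xd8\xa8\xd8\xa7xyz')
print(y == z)
```

Because repr() of a bytearray produces text in this same form, parse_bytearray_repr(repr(z)) should give back an equal bytearray, which makes this a reasonable fit for strings that were originally produced by printing a bytearray.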
Upvotes: 3