Anton

Reputation: 4815

Convert a string object with byte characters into a byte object?

I have a string like this:

text = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''

It is clearly meant to be a bytes literal, but when I check the object's type, I get:

type(text)  
<class 'str'>

I tried encoding to bytes and then decoding, but this was the result:

text.encode("utf-8").decode("utf-8")
'b\'"Bill of the oneâ\x80\x99s store wanted to go outside.\''

How can I get the text properly formatted?

Upvotes: 1

Views: 81

Answers (2)

brianpck

Reputation: 8254

As another possible approach: the string you have looks like the result of calling repr on a bytes object. For a literal like this, you can reverse the repr by calling ast.literal_eval:

>>> import ast
>>> x = b'test string'
>>> y = repr(x)
>>> y
"b'test string'"
>>> ast.literal_eval(y)
b'test string'

Or in your case:

>>> x = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''
>>> import ast
>>> ast.literal_eval(x)
b'"Bill of the  one\xe2\x80\x99s store wanted to go outside.'
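
To get from there to readable text, one more step is probably needed: decoding the recovered bytes. The sketch below assumes the original bytes were UTF-8 encoded (the \xe2\x80\x99 sequence suggests so, but the question does not confirm it); the names raw and decoded are only illustrative.

```python
import ast

# The string from the question: the repr of a bytes object, stored as a str.
text = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''

# literal_eval only parses Python literals, so it is safe for this purpose.
raw = ast.literal_eval(text)

# Guard against input that parses to something other than bytes.
if not isinstance(raw, bytes):
    raise TypeError("expected the repr of a bytes object")

# Assumption: the bytes are UTF-8; \xe2\x80\x99 then decodes to U+2019.
decoded = raw.decode("utf-8")
print(decoded)
```

If the input is not a valid Python literal, ast.literal_eval raises ValueError or SyntaxError, so real input may need a try/except around that call.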

Upvotes: 3

Why are you calling both encode and decode on the string? Doing both simply brings you back to where you started (a str). Calling encode alone is sufficient.

text = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''
type(text)  # This will output <class 'str'>

Now, to get a bytes object, use the snippet below:

byte_object = text.encode("utf-8")
type(byte_object)  # This will output <class 'bytes'>
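
It may be worth checking what the resulting object actually contains. The sketch below (the variable name encoded is illustrative) suggests that encode yields a bytes object holding the literal characters of the string, including the backslash escapes, rather than the bytes those escapes describe.

```python
text = 'b\'"Bill of the  one\\xe2\\x80\\x99s store wanted to go outside.\''

encoded = text.encode("utf-8")

# The type is bytes, as expected.
print(type(encoded))

# The escape sequence is still spelled out as backslash, x, e, 2.
has_literal_escape = b"\\xe2" in encoded

# The three real bytes of the UTF-8 apostrophe are not present.
has_real_bytes = b"\xe2\x80\x99" in encoded

print(has_literal_escape, has_real_bytes)
```

So this approach changes the type but does not, by itself, recover the bytes that the string represents.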

Upvotes: 1
