Samarth D M

Reputation: 35

Decoding special-character bytes escaped with double backslashes

x = 'Amazing!! Fantastic coffee and cakes to die for \\xf0\\x9f\\x98\\x80\\xf0\\x9f\\x98\\x80'

y = 'There is a line outside the door for good reason! Everything (and there is plenty) is absolutely excellent. As long as you\\xe2\\x80\\x99re a fan of the best freshly made French baked goods and enjoy an amazing croque monsieur, you\\xe2\\x80\\x99ll be happy here!'

How do I decode these strings and store them in a flat file, so that the CSV shows the emojis and other special characters instead of the escaped bytes?

Upvotes: 0

Views: 174

Answers (1)

Andrej Kesely

Reputation: 195528

It seems that x and y are of type str.

You can do this to decode the strings:

x = 'Amazing!! Fantastic coffee and cakes to die for \xf0\x9f\x98\x80\xf0\x9f\x98\x80'
y = 'There is a line outside the door for good reason! Everything (and there is plenty) is absolutely excellent. As long as you\xe2\x80\x99re a fan of the best freshly made French baked goods and enjoy an amazing croque monsieur, you\xe2\x80\x99ll be happy here!'

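# Each character here stands for one byte (code point <= 0xFF):
# rebuild the raw bytes, then decode them as UTF-8.
print(bytearray([ord(c) for c in x]).decode('utf-8'))
print(bytearray([ord(c) for c in y]).decode('utf-8'))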

Prints:

Amazing!! Fantastic coffee and cakes to die for 😀😀
There is a line outside the door for good reason! Everything (and there is plenty) is absolutely excellent. As long as you’re a fan of the best freshly made French baked goods and enjoy an amazing croque monsieur, you’ll be happy here!
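
As a possible alternative to building the bytearray by hand, the same round trip can be sketched with the latin-1 codec, which maps code points 0 to 255 directly onto single bytes. The helper name fix_mojibake is hypothetical, and the sketch assumes every character in the string has a code point of at most 0xFF:

```python
def fix_mojibake(text: str) -> str:
    """Reinterpret a str whose characters are really UTF-8 bytes.

    Assumption: every character has a code point <= 0xFF, so latin-1
    can turn each one back into the single byte it stands for.
    """
    return text.encode("latin-1").decode("utf-8")


x = 'Amazing!! Fantastic coffee and cakes to die for \xf0\x9f\x98\x80\xf0\x9f\x98\x80'
print(fix_mojibake(x))
```

If the string contains a character above 0xFF, the encode step raises UnicodeEncodeError, much as bytearray raises ValueError in the version above.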

EDIT: If you have a string with double backslashes, you can use ast.literal_eval to decode it:

x = 'Amazing!! Fantastic coffee and cakes to die for \\xf0\\x9f\\x98\\x80\\xf0\\x9f\\x98\\x80'
y = 'There is a line outside the door for good reason! Everything (and there is plenty) is absolutely excellent. As long as you\\xe2\\x80\\x99re a fan of the best freshly made French baked goods and enjoy an amazing croque monsieur, you\\xe2\\x80\\x99ll be happy here!'

from ast import literal_eval

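# Wrap the text in a triple-quoted bytes literal so that literal_eval
# interprets the \xNN escapes, then decode the resulting bytes as UTF-8.
print(literal_eval('b"""' + x + '"""').decode('utf-8'))
print(literal_eval('b"""' + y + '"""').decode('utf-8'))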
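
The question also asks about saving the result to a CSV file. Below is a minimal sketch of one way to do that. It swaps literal_eval for the unicode_escape codec and assumes the input holds only ASCII characters plus literal \xNN escapes. The helper names decode_escaped and write_reviews_csv, the column name review, and the file name reviews.csv are all hypothetical:

```python
import csv


def decode_escaped(text: str) -> str:
    """Turn literal '\\xNN' sequences into the UTF-8 text they encode.

    Assumption: `text` is pure ASCII apart from the escapes themselves.
    """
    # Step 1: interpret the escapes; each \xNN becomes one character U+00NN.
    one_char_per_byte = text.encode("ascii").decode("unicode_escape")
    # Step 2: map those characters back to raw bytes and decode as UTF-8.
    return one_char_per_byte.encode("latin-1").decode("utf-8")


def write_reviews_csv(path: str, texts: list[str]) -> None:
    """Write one decoded review per row under a single 'review' column."""
    # utf-8-sig adds a BOM, which helps some spreadsheet programs detect UTF-8.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["review"])
        for text in texts:
            writer.writerow([decode_escaped(text)])


x = 'Amazing!! Fantastic coffee and cakes to die for \\xf0\\x9f\\x98\\x80\\xf0\\x9f\\x98\\x80'
y = 'There is a line outside the door for good reason! Everything (and there is plenty) is absolutely excellent. As long as you\\xe2\\x80\\x99re a fan of the best freshly made French baked goods and enjoy an amazing croque monsieur, you\\xe2\\x80\\x99ll be happy here!'

if __name__ == "__main__":
    write_reviews_csv("reviews.csv", [x, y])
```

Opening the file with an explicit encoding matters here: without it, Python falls back to the platform default, which may not be able to represent the emojis.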

Upvotes: 1
