Gordon Hui

Reputation: 375

Encoding Emojis from text which includes escape sequences

I am trying to print text that contains emojis. The input has this form: text = "\\ud83d\\ude04\\n\\u3082\\u3042", and I want it to print as:

# my expected output
# a newline after the emoji, then the Japanese characters
>>>😄
もあ

I have read a related question, but it solves only part of the problem:

Best and clean way to Encode Emojis (Python) from text file

I followed the code from that post and got the result below:

emoji_text = "\\ud83d\\ude04\\n\\u3082\\u3042".encode("latin_1")
output = (emoji_text
  .decode("raw_unicode_escape")
  .encode('utf-16', 'surrogatepass')
  .decode('utf-16')
)
print(output)

>>>😄\nもあ
# it prints a literal \n instead of a newline

So my question is: how can I convert escape sequences such as \n, \t, and \b while also converting the emoji and the other text?

Upvotes: 0

Views: 743

Answers (1)

tripleee

Reputation: 189387

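Using unicode_escape instead of raw_unicode_escape will decode the \n as well, because raw_unicode_escape only interprets \uXXXX and \UXXXXXXXX sequences and leaves every other backslash sequence untouched. Though if there is a reason you used raw_unicode_escape in the first place, perhaps this will not be suitable?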
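
As a minimal sketch, here is the pipeline from the question with only the codec name changed. Everything else, including the "latin_1" encoding step and the UTF-16 round trip that joins the surrogate pair into a single emoji, is kept as the question had it:

```python
# Same input as in the question: literal backslash sequences, not real characters.
text = "\\ud83d\\ude04\\n\\u3082\\u3042"

output = (
    text.encode("latin_1")
    # unicode_escape handles \uXXXX and also \n, \t, \b and friends
    .decode("unicode_escape")
    # the emoji arrives as two lone surrogates; the UTF-16 round trip pairs them
    .encode("utf-16", "surrogatepass")
    .decode("utf-16")
)
print(output)
```

With this change the \n in the input should come out as a real line break between the emoji and the Japanese characters.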

Your choice to encode into "latin-1" is vaguely odd, but perhaps there is a reason for that, too. Perhaps you should encode into "ascii" and be prepared to cope with any possible fallout.
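
One possible way to cope with that fallout, sketched under the assumption that any real non-ASCII characters already in the input should pass through unchanged: encode to ASCII with the "backslashreplace" error handler, so those characters become escape sequences that unicode_escape then decodes back. The helper name decode_escapes is hypothetical and not from the question.

```python
def decode_escapes(text: str) -> str:
    """Decode backslash escapes (\\uXXXX, \\n, \\t, ...) found in text.

    Hypothetical helper for illustration only.
    """
    # Non-ASCII characters become \xNN, \uXXXX or \UXXXXXXXX escapes
    # instead of raising UnicodeEncodeError.
    raw = text.encode("ascii", "backslashreplace")
    # Decode every escape sequence, including the ones just created.
    decoded = raw.decode("unicode_escape")
    # Join surrogate pairs such as \ud83d\ude04 into one code point.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16")


print(decode_escapes("caf\u00e9 \\ud83d\\ude04\\n\\u3082\\u3042"))
```

Note that an unpaired surrogate in the input would still make the final UTF-16 decode raise UnicodeDecodeError, so this sketch assumes surrogates always come in valid pairs.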

Upvotes: 1
