Reputation: 43
I have the following string: "◣⛭◣◃✺▲♢"
and I want to convert it into "\u25E3\u26ED\u25E3\u25C3\u273A\u25B2\u2662",
exactly as this site does: https://mothereff.in/js-escapes
I was wondering if this is possible in Python. I have tried a lot of things from the Python unicode docs, but nothing worked.
Example of what I tried before:
#!/usr/bin/env python
# -*- coding: latin-1 -*-
f = open('js.js', 'r').read()
print(ord(f[:1]))
Any help would be appreciated!
Upvotes: 4
Views: 2649
Reputation: 46423
If you're on Python 2, I suspect you're getting something like this:
>>> s = "◣⛭◣◃✺▲♢"
>>> s[0]
'\xe2'
To get at the Unicode code points in a UTF-8 encoded file (or buffer), you first need to decode it into a Python unicode object; otherwise you only see the individual bytes that make up the UTF-8 encoding.
>>> s_utf8 = s.decode('utf-8')
>>> s_utf8[0]
u'\u25e3'
>>> ord(s_utf8[0])
9699
>>> hex(ord(s_utf8[0]))
'0x25e3'
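For readers on Python 3, the same bytes-versus-code-points distinction can be sketched as follows. This is a minimal illustration, not part of the original session; the variable names are made up for the example, and the bytes are produced with encode() to stand in for what reading the file in binary mode would return.

```python
# Python 3 sketch: bytes hold the UTF-8 encoding, str holds code points.
# All names here are illustrative.
raw = "◣⛭◣◃✺▲♢".encode("utf-8")   # stands in for open(..., "rb").read()
first_byte = raw[0]                 # first byte of the UTF-8 sequence for U+25E3
text = raw.decode("utf-8")          # decode the bytes into a str of code points
first_code_point = ord(text[0])     # code point of the first character

print(hex(first_byte), hex(first_code_point))  # 0xe2 0x25e3
```

Opening the file in text mode with an explicit encoding="utf-8" would perform the decode step for you.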
In your case, you can go straight from the ord() to a literal unicode escape with something like this:
>>> "\\u%04x" % (ord(s_utf8[0]))
'\\u25e3'
Or convert the entire string in one go with a list comprehension:
>>> ''.join(["\\u%04x" % (ord(c)) for c in s_utf8])
'\\u25e3\\u26ed\\u25e3\\u25c3\\u273a\\u25b2\\u2662'
Note that converting this way escapes every character in the string, ASCII included. You'll have to decide which code points to escape, or the ABCs will be escaped too:
>>> ''.join(["\\u%04x" % (ord(c)) for c in u"ABCD"])
'\\u0041\\u0042\\u0043\\u0044'
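One way to make that decision is to escape only non-ASCII characters. The following is a minimal Python 3 sketch under that assumption; js_escape is a hypothetical helper name, the uppercase hex digits are chosen to match the output requested in the question, and the surrogate-pair branch is an addition for characters outside the Basic Multilingual Plane, which JavaScript's \uXXXX notation represents as two escapes.

```python
def js_escape(text):
    # Hypothetical helper: escape non-ASCII characters as \uXXXX and
    # leave plain ASCII characters untouched.
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)                  # ASCII stays as-is
        elif cp <= 0xFFFF:
            parts.append("\\u%04X" % cp)      # single BMP escape
        else:
            # Above U+FFFF: split into a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            parts.append("\\u%04X\\u%04X" % (high, low))
    return "".join(parts)

print(js_escape("AB◣⛭"))  # AB\u25E3\u26ED
```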
Or just use georg's suggestion and let Python figure all that out for you.
Upvotes: 0
Reputation: 6331
Assuming you're using Python 3:
unicode_string = "◣⛭◣◃✺▲♢"
byte_string = unicode_string.encode('ascii', 'backslashreplace')
print(byte_string)
See the codecs module documentation for more information.
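Note that encode() returns a bytes object, so printing it shows a b'...' repr with doubled backslashes. A minimal sketch of getting a plain str instead, assuming the input contains only characters in the range U+0100 to U+FFFF as in the question (the name escaped is illustrative):

```python
unicode_string = "◣⛭◣◃✺▲♢"

# backslashreplace yields ASCII-only bytes; decoding them gives a str
# that contains the literal backslash escapes.
escaped = unicode_string.encode("ascii", "backslashreplace").decode("ascii")

print(escaped)  # \u25e3\u26ed\u25e3\u25c3\u273a\u25b2\u2662
```

The hex digits come out lowercase. The handler also writes characters from U+0080 to U+00FF as \xNN and characters above U+FFFF as \UXXXXXXXX, so the result only matches JavaScript's \uXXXX form for input like the string above.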
However, to work with JavaScript notation specifically, there is the dedicated json module, which achieves the same thing:
import json
unicode_string = "◣⛭◣◃✺▲♢"
json_string = json.dumps(unicode_string)
print(json_string)
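The value returned by json.dumps is a complete JSON string literal, so it includes the surrounding double quotes. A short sketch of what to expect, where the slicing step and the name bare are illustrative additions rather than part of the answer:

```python
import json

unicode_string = "◣⛭◣◃✺▲♢"

# ensure_ascii=True is the default, so non-ASCII characters become \uXXXX.
json_string = json.dumps(unicode_string)

# Drop the surrounding double quotes to keep only the escaped text.
bare = json_string[1:-1]

print(json_string)  # "\u25e3\u26ed\u25e3\u25c3\u273a\u25b2\u2662"
print(bare)         # \u25e3\u26ed\u25e3\u25c3\u273a\u25b2\u2662
```

The hex digits are lowercase, and characters above U+FFFF are written as surrogate pairs, which is the form JavaScript expects. Keep in mind that JSON also escapes double quotes, backslashes, and control characters in the input.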
Upvotes: 4