rainkin

Reputation: 7

Escape a string which contains non-ASCII characters

I have a string s = "\\u653e".

I want to convert it into s = "\u653e".

To make it clear:

# this is what I want
>>s
>>'\u653e'
# this is not what I want: print unescapes the string automatically
>>print s
>>\u653e

How can I do that?


The original problem is this:

I have a string s = u'\u653e', so [s] = [u'\u653e']. I want to remove the u prefix, that is, get [s] = ['\u653e'].

So I used ast.literal_eval(json.dumps(r)), which gave me the string "\\u653e" above.


UPDATE: Thanks, tdelaney.

Creating a string from the entire list caused my problem. What I should do is start with a unicode string and build the command from the list's individual elements instead of from the entire list. For more details, see his answer.

Upvotes: 0

Views: 157

Answers (1)

tdelaney

Reputation: 77337

s is a single unicode character. \u653e is an escape sequence that Python uses to express unicode characters in ASCII text. The unicode_escape codec converts between the two forms: the unicode character and its ASCII escape sequence.

>>> s = u'\u653e'
>>> print type(s), len(s), s
<type 'unicode'> 1 放
>>> encoded = s.encode('unicode_escape')
>>> print type(encoded), len(encoded), encoded
<type 'str'> 6 \u653e
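
The reverse direction works the same way. As a minimal sketch (the name `decoded` is only illustrative, and the snippet is written so that it should run on both Python 2.7 and Python 3.3+), decoding the escaped bytes with the same codec gives back the original character:

```python
# -*- coding: utf-8 -*-
s = u'\u653e'                                # one unicode character

# Encode: the character becomes the 6 ASCII bytes \, u, 6, 5, 3, e
encoded = s.encode('unicode_escape')

# Decode: the 6 ASCII bytes are turned back into the single character
decoded = encoded.decode('unicode_escape')

print(len(s), len(encoded), len(decoded))    # lengths: 1, 6, 1
print(decoded == s)                          # the round trip is lossless
```

Note that on Python 3 `encoded` is a `bytes` object rather than a `str`, so it needs the `decode` step before it can be mixed with text.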

In your example, just do

>>> s = u'\u653e'
>>> somelist = [s.encode('unicode_escape')]
>>> print somelist
['\\u653e']
>>> print somelist[0]
\u653e

Update

From your comments, your problem may be in how you create your command string. There seems to be confusion between Python's representation of a string (its repr) and the string itself. Use a unicode string to start with and build the command from the list's individual elements instead of from the entire list.

>>> excel = [u'\u4e00', u'\u4e8c', u'\u4e09']
>>> cmd = u'create vertex v set s = [{}]'.format(u','.join(excel))
>>> cmd
u'create vertex v set s = [\u4e00,\u4e8c,\u4e09]'
>>> print cmd
create vertex v set s = [一,二,三]
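
To see why serializing the whole list causes trouble, here is a small sketch comparing the two approaches. It assumes the list was serialized with json.dumps, as in the question; the names `whole` and `joined` are only illustrative. With its default settings, json.dumps escapes every non-ASCII character, while joining the elements keeps the characters themselves:

```python
# -*- coding: utf-8 -*-
import json

excel = [u'\u4e00', u'\u4e8c', u'\u4e09']

# Serializing the entire list: json.dumps defaults to ensure_ascii=True,
# so each character is replaced by a literal backslash-u escape sequence.
whole = json.dumps(excel)

# Joining the individual elements: the unicode characters are kept as-is.
joined = u','.join(excel)
cmd = u'create vertex v set s = [{}]'.format(joined)

print(whole)        # contains backslash escapes, not the characters
print(len(joined))  # 5: three characters plus two commas
```

If the target system expects quoted values, the quoting would have to be added per element before joining; that detail depends on the command syntax, which the question does not show.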

Upvotes: 1
