Reputation: 1569
I have a unicode string as a result: u'splunk>\xae\uf001'
How can I get the substring 'uf001' as a simple string in Python?
Upvotes: 2
Views: 4760
Reputation: 414865
u'' is how a Unicode string literal is written in Python source code. The REPL uses this representation by default to display unicode objects:
>>> u'splunk>\xae\uf001'
u'splunk>\xae\uf001'
>>> print(u'splunk>\xae\uf001')
splunk>®
>>> print(u'splunk>\xae\uf001'[-1])
If your terminal is not configured to display Unicode, or if you are on a narrow build (likely for Python 2 on Windows), the result may be different.
A Unicode string in Python is an immutable sequence of Unicode codepoints. len(u'\uf001') == 1: the string does not contain uf001 (5 characters). You could also write it as u'' (the literal U+F001 character between the quotes, which may not render in your font; on Python 2 you must declare the character encoding of your source file if it contains non-ASCII characters):
>>> u'\uf001' == u''
True
It is just a different way to represent exactly the same Unicode character (a single codepoint in this case).
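The point above can be checked directly. The following is a minimal sketch, assuming Python 3 (where the u'' prefix is optional); the variable names `s` and `last` are only illustrative:

```python
# Python 3 sketch; on Python 2 the literal would need the u'' prefix.
s = 'splunk>\xae\uf001'

last = s[-1]                  # the single codepoint U+F001
print(len(s))                 # 9: seven ASCII characters, U+00AE, U+F001
print(len(last))              # 1
print(hex(ord(last)))         # 0xf001
print('uf001' in s)           # False: those five characters are not in the string
```

The text `\uf001` exists only in the source code and in the repr() output; in memory there is one codepoint.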
Note: some user-perceived characters may span several Unicode codepoints, e.g.:
>>> import unicodedata
>>> unicodedata.normalize('NFKD', u'ё')
u'\u0435\u0308'
>>> print(unicodedata.normalize('NFKD', u'ё'))
ё
Upvotes: 1
Reputation: 91009
Since you want the actual string (as seen from the comments), just take the last character with the [-1] index. Example:
>>> a = u'splunk>\xae\uf001'
>>> print(a)
splunk>®
>>> a[-1]
'\uf001'
>>> print(a[-1])
If you want the escaped representation (\uf001), take repr(a[-1]). Example:
>>> repr(a[-1])
"'\\uf001'"
\uf001 is a single Unicode character (not several characters), so you can get it directly as shown above.
You see \uf001 because you are looking at the result of repr() on the string. If you print it or use it somewhere else (for example, write it to a file), it will be the actual U+F001 character.
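To make the difference between the escaped form and the character itself concrete, here is a small sketch, assuming Python 3; the names `ch`, `escaped` and `utf8` are just for illustration:

```python
ch = '\uf001'

escaped = repr(ch)             # the escaped text, including the quotes
utf8 = ch.encode('utf-8')      # what would actually be written to a UTF-8 file

print(len(ch))                 # 1: one codepoint
print(len(escaped))            # 8: quote, backslash, u, f, 0, 0, 1, quote
print(utf8)                    # b'\xef\x80\x81'
```

U+F001 is a private-use codepoint, so repr() escapes it rather than showing a glyph, and what print() displays depends on the terminal and font.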
Upvotes: 2
Reputation: 198526
The characters uf001 are not actually present in the string, so you can't just slice them off. You can do
repr(s)[-6:-1]
or
'u' + hex(ord(s[-1]))[2:]
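Note that the hex() variant does not zero-pad, so a character such as u'\xae' would come out as 'uae'. A slightly more defensive sketch, assuming Python 3 and using a hypothetical helper name `escape_name`, could look like this:

```python
def escape_name(ch):
    """Return the escape-style name of a single codepoint, e.g. 'uf001'."""
    cp = ord(ch)
    if cp > 0xFFFF:
        # Codepoints outside the BMP use the 8-digit \U form.
        return 'U%08x' % cp
    # Zero-pad to four hex digits, matching the \uXXXX form.
    return 'u%04x' % cp


s = 'splunk>\xae\uf001'
print(escape_name(s[-1]))      # uf001
print(escape_name(s[-2]))      # u00ae
```

This always uses the \uXXXX spelling, whereas repr() prefers \xae for codepoints below 0x100, so the two will not agree for every character.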
Upvotes: 2