SkyFox

Reputation: 1875

Split and non-latin strings in Python 2.X

Example:

# -*- coding: utf-8 -*-
my_str = u'Строка ^ с ^ разделителями!' # Russian letters
print my_str.replace(' ', '')
print my_str.replace(' ', '').split('^')

Result:

Строка^с^разделителями!
[u'\u0421\u0442\u0440\u043e\u043a\u0430', u'\u0441', u'\u0440\u0430\u0437\u0434\u0435\u043b\u0438\u0442\u0435\u043b\u044f\u043c\u0438!']

Please help. How can I display 'normal' (readable) strings after splitting?

P.S. The script file is encoded as UTF-8.

Upvotes: 1

Views: 1141

Answers (1)

Tim Pietzcker

Reputation: 336208

These are normal strings. You're just seeing their repr() form, because in the second example you're printing a list rather than a string, and printing a list shows the repr() of each element. Do

for s in my_str.replace(' ', '').split('^'):
    print s

and you'll see. Conversely, try

print repr(my_str.replace(' ', ''))

and see what happens then.
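If you want the pieces on one line instead of one per line, a minimal sketch is to join them back into a single unicode string before printing. The names `parts` and `joined` are mine, not from the question, and the `from __future__` import is only there so the same snippet runs on both Python 2 and 3. It also assumes your terminal's encoding can display Cyrillic.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function  # assumption: makes print() work the same on Python 2 and 3

my_str = u'Строка ^ с ^ разделителями!'
parts = my_str.replace(u' ', u'').split(u'^')

# Printing the list itself shows repr() of each element (\uXXXX escapes on Python 2).
# Joining the elements produces one unicode string, which print renders readably.
joined = u', '.join(parts)
print(joined)
```

The list still holds the same unicode objects either way; only the way they are displayed differs.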

Upvotes: 4
