Tom Xue

Reputation: 3355

Why can't I save a file with Chinese characters in Python 2.7.11 IDLE?

I just downloaded the latest 64-bit Python 2.7.11 from the official website and installed it on Windows 10. If a new IDLE file contains Chinese characters, such as 你好, I cannot save it. After several attempts to save, the editor window crashed and the new file was lost.

I also installed the latest python-3.5.1-amd64.exe, and it does not have this issue.

How can I solve it?

More: An example is the code from this wiki page: https://zh.wikipedia.org/wiki/%E9%B8%AD%E5%AD%90%E7%B1%BB%E5%9E%8B

If I paste the code here, Stack Overflow always warns me: Body cannot contain "I just dow". Why?

Thanks!


More: I found this config option, but it does not help at all: IDLE -> Options -> Configure IDLE -> General -> Default Source Encoding: UTF-8

More: Adding the u prefix before a Chinese string literal makes everything work; it is a great solution.

Without the u prefix, the output is sometimes corrupted.

Upvotes: 5

Views: 2982

Answers (4)

Julian Eccleshall

Reputation: 564

Even in Python 3.7 I still experience the same issue; UTF-8 still does the trick.

Upvotes: 0

Mikhail Batcer

Reputation: 2065

When using Python 2 on Windows:

  1. For a file containing Unicode characters to be saved in IDLE, the line

    # -*- coding: utf-8 -*-
    

    has to be added at the beginning of the file.

  2. For Unicode characters to display correctly in console output on Windows when running a saved script, whether from the IDLE console or the Windows shell, string literals have to be prefixed with u:

    print u"你好"
    print u"Привет"
    

    In interactive mode, however, I found this was not needed for Cyrillic.

Upvotes: 0

Sean Francis N. Ballais

Reputation: 2488

In Python 2, plain string literals are byte strings and the default encoding is ASCII, which cannot represent Chinese characters. In Python 3, on the other hand, strings are Unicode by default, so they can hold Chinese characters.

But that doesn't mean Python 2 cannot use Unicode strings. You just have to convert (decode) your byte strings into Unicode strings. Here's an example of that conversion.

>>> plain_text = "Plain text"
>>> plain_text
'Plain text'
>>> utf8_text = unicode(plain_text, "utf-8")
>>> utf8_text
u'Plain text'

The prefix u in the output for utf8_text shows that it is a Unicode string.

You could also do this.

>>> print u"你好"
你好

You just have to prepend your string with u to signify that it is a Unicode string.
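
As a rough sketch of the same conversion for non-ASCII text, decoding UTF-8 bytes gives back the Unicode string. The variable names are my own, and the sketch assumes the bytes really are UTF-8. It is written so that it should run on both Python 2.7 and Python 3.3+:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; the variable names are hypothetical.

# UTF-8 bytes for the two characters 你好 (three bytes per character).
utf8_bytes = b"\xe4\xbd\xa0\xe5\xa5\xbd"

# Decoding turns the byte string into a Unicode string.
decoded = utf8_bytes.decode("utf-8")

# The u prefix marks a Unicode literal (needed in Python 2, allowed in 3.3+).
literal = u"你好"

same_text = decoded == literal   # the decoded bytes match the literal
byte_count = len(utf8_bytes)     # counts bytes
char_count = len(decoded)        # counts characters
```

The byte string and the Unicode string hold the same text, but their lengths differ because one counts bytes and the other counts characters.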

Upvotes: 2

Tales Pádua

Reputation: 1461

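Python 2.x uses ASCII as its default encoding, while Python 3.x uses UTF-8. If you have a Unicode string, just use:
my_string.encode("utf-8")
to convert it to UTF-8 bytes (or change "utf-8" to any other encoding you need). Note that in Python 2, calling encode on a byte string that contains non-ASCII bytes first decodes it implicitly as ASCII and raises UnicodeDecodeError.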

You can also try putting this line at the top of your file:

# -*- coding: utf-8 -*-
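
To illustrate the encode call together with the coding line, here is a small sketch that encodes one Unicode string with two different codecs. The variable names other than my_string are mine, and it assumes the file itself is saved as UTF-8, as the first line declares:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; as_utf8, as_gbk and round_trip are hypothetical names.

my_string = u"你好"  # Unicode string literal

# Encoding produces bytes; the result depends on the codec chosen.
as_utf8 = my_string.encode("utf-8")  # three bytes per character here
as_gbk = my_string.encode("gbk")     # two bytes per character here

# Decoding with the matching codec recovers the original text.
round_trip = as_gbk.decode("gbk")
```

The same text becomes different byte sequences under different encodings, so whatever reads the bytes has to use the matching codec.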

Upvotes: 2
