bigturtle

Reputation: 371

How to make print() output UTF-8 in Python 3.0?

I'm working in WinXP 5.1.2600, writing a Python application involving Chinese pinyin, which has involved me in endless Unicode problems. Switching to Python 3.0 has solved many of them. But the print() function for console output is not Unicode-aware for some odd reason. Here's a teeny program.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
    
import sys

print('sys.stdout encoding is "' + sys.stdout.encoding + '"')
str1 = 'lüelā'
print(str1)

Output is (changing angle brackets to square brackets for readability):

    sys.stdout encoding is "cp1252"
    Traceback (most recent call last):
      File "TestPrintEncoding.py", line 22, in [module]
        print(str1)
      File "C:\Python30\lib\io.py", line 1491, in write
        b = encoder.encode(s)
      File "C:\Python30\lib\encodings\cp1252.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character '\u0101' 
    in position 4: character maps to [undefined]

Note that ü = '\xfc' = 252 gives no problem, since it fits in the single byte that cp1252 uses per character. But ā = '\u0101' is beyond 8 bits and has no cp1252 equivalent.
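A quick check of that reasoning, as a minimal sketch: cp1252 is a single-byte code page, so it can represent ü (U+00FC) but has no slot for ā (U+0101). The helper name `can_encode` is made up for illustration.

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Escapes are used so the snippet does not depend on the source file encoding.
print(can_encode('\xfc', 'cp1252'))    # ü is byte 0xFC in cp1252
print(can_encode('\u0101', 'cp1252'))  # ā has no cp1252 byte
print(can_encode('\u0101', 'utf-8'))   # UTF-8 can represent ā
```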

Anyone have an idea how to change the encoding of sys.stdout to 'utf-8'? Bear in mind that Python 3.0 no longer uses the codecs module, if I understand the documentation right.


(Note that the coding specified by the "coding:" line is the coding of the source code, not of the console output. But thank you for your thoughts!)

Upvotes: 18

Views: 21651

Answers (5)

Adam Bartoš

Reputation: 717

The problem of displaying Unicode characters in Python on Windows is a known one. There is no official solution yet. The right thing to do is to use the WinAPI function WriteConsoleW. Building a working solution is nontrivial, as there are other related issues. However, I have developed a package which tries to fix Python in this respect. See https://github.com/Drekin/win-unicode-console, where you can also read a deeper explanation of the problem. The package is also on PyPI (https://pypi.python.org/pypi/win_unicode_console) and can be installed using pip.

Upvotes: 1

Adobe

Reputation: 13467

Here's a dirty hack:

# works
import os
os.system("chcp 65001 &")
print("юникод")

However, almost anything breaks it:

  • simply muting the first line already breaks it:

    # doesn't work
    import os
    os.system("chcp 65001 >nul &")
    print("юникод")
    
  • checking for OS type breaks it:

    # doesn't work
    import os
    if os.name == "nt":
        os.system("chcp 65001 &")
    
    print("юникод")
    
  • it doesn't even work inside an if block:

    # doesn't work
    import os
    if os.name == "nt":
        os.system("chcp 65001 &")
        print("юникод")
    

But one can print with cmd's echo:

# works
import os
os.system("chcp 65001 & echo {0}".format("юникод"))

and here's a simple way to make this cross-platform:

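# works

import os

def simple_cross_platform_print(obj):
    if os.name == "nt":
        # Note: obj is passed to cmd.exe unescaped, so shell
        # metacharacters in it will be interpreted by the shell.
        os.system("chcp 65001 >nul & echo {0}".format(obj))
    else:
        print(obj)

simple_cross_platform_print("юникод")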

but the trailing empty line that Windows' echo produces can't be suppressed.

Upvotes: 1

daveagp

Reputation: 2669

You may want to try setting the environment variable "PYTHONIOENCODING" to "utf_8". I have written a page on my ordeal with this problem.
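A minimal sketch of what that variable does, assuming Python 3.5 or later runs the check (for `subprocess.run`): a child interpreter is started with `PYTHONIOENCODING` set and its raw stdout bytes are captured through a pipe. When the output goes to a real Windows console instead, the console font and code page still decide what is displayed.

```python
import os
import subprocess
import sys

# Child program written with escapes so the command line stays pure ASCII.
child_code = r"print('l\xfcel\u0101')"

# Copy the current environment and force UTF-8 for the child's standard streams.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    stdout=subprocess.PIPE,
    check=True,
)

# The child wrote UTF-8 bytes, whatever the parent's console encoding is.
text = result.stdout.decode("utf-8").strip()
print(ascii(text))
```

At a cmd.exe prompt the equivalent is to run `set PYTHONIOENCODING=utf-8` before starting the script.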

Upvotes: 12

itsadok

Reputation: 29342

Check out the question and answer here, I think they have some valuable clues. Specifically, note the setdefaultencoding in the sys module, but also the fact that you probably shouldn't use it.
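For reference, `sys.setdefaultencoding` does not exist in Python 3, so that route is closed there anyway. A commonly used alternative, shown here as a sketch and not as an official recipe, is to re-wrap the binary buffer behind `sys.stdout` in a UTF-8 text layer. The helper name `utf8_writer` is made up for illustration, and the demo writes to an in-memory buffer so the resulting bytes can be inspected.

```python
import io

def utf8_writer(binary_stream):
    """Wrap a binary stream in a text layer that always encodes as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)

# Demo against an in-memory buffer instead of the real console.
buffer = io.BytesIO()
out = utf8_writer(buffer)
print('l\xfcel\u0101', file=out)
out.flush()
print(buffer.getvalue())  # the UTF-8 bytes that would reach the console

# In a real program one might do the following (assumption: sys.stdout has a
# .buffer attribute, which holds for the standard stream but not for every
# object a host environment may substitute for it):
#     import sys
#     sys.stdout = utf8_writer(sys.stdout.buffer)
```

On Python 3.7 and later, `sys.stdout.reconfigure(encoding="utf-8")` achieves the same without replacing the stream object.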

Upvotes: 2

Brandon

Reputation: 3764

The Windows command prompt (cmd.exe) cannot display the Unicode characters you are using, even though Python is handling them correctly internally. You need to use IDLE, Cygwin, or another program that can display Unicode correctly.

See this thread for a full explanation: http://www.nabble.com/unable-to-print-Unicode-characters-in-Python-3-td21670662.html
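If the console really cannot show a character, one way to avoid the traceback is to degrade unencodable characters instead of raising. A minimal sketch, with hypothetical helper names `to_console_text` and `safe_print`; it does not make cmd.exe display ā, it only keeps the program running.

```python
import sys

def to_console_text(text, encoding, errors="backslashreplace"):
    """Round-trip text through the target encoding, escaping what it cannot hold."""
    return text.encode(encoding, errors=errors).decode(encoding)

def safe_print(text, stream=None):
    """Print text, escaping characters the stream's encoding cannot represent."""
    if stream is None:
        stream = sys.stdout
    # Streams without a known encoding (e.g. io.StringIO) accept any text,
    # so fall back to UTF-8, which makes the round trip a no-op.
    encoding = getattr(stream, "encoding", None) or "utf-8"
    print(to_console_text(text, encoding), file=stream)

# On a cp1252 console this prints the ā as the escape \u0101;
# on a UTF-8 terminal the text comes through unchanged.
safe_print('l\xfcel\u0101')
```

Passing `errors="replace"` to `to_console_text` yields a `?` in place of each unsupported character, which is shorter but loses the information about which character was dropped.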

Upvotes: 15
