kev

Reputation: 161604

Why do Python IDLE and the console produce different results?

I wrote a simple Python script to translate Chinese punctuation to English punctuation.

import codecs, sys

def trcn():
    tr = lambda x: x.translate(str.maketrans(""",。!?;:、()【】『』「」﹁﹂“”‘’《》~¥…—×""", """,.!?;:,()[][][][]""''<>~$^-*"""))
    out = codecs.getwriter('utf-8')(sys.stdout)
    for line in sys.stdin:
        out.write(tr(line))

if __name__ == '__main__':
    if not len(sys.argv) == 1:
        print("usage:\n\t{0} STDIN STDOUT".format(sys.argv[0]))
        sys.exit(-1)
    trcn()
    sys.exit(0)

But something goes wrong with the Unicode handling, and the script fails. Error message:

Traceback (most recent call last):
  File "trcn.py", line 13, in <module>
    trcn()
  File "trcn.py", line 7, in trcn
    out.write(tr(line))
  File "C:\Python31\Lib\codecs.py", line 356, in write
    self.stream.write(data)
TypeError: must be str, not bytes

After that, I tested out.write() in IDLE and in the console. They produced different results, and I don't know why.

In IDLE

Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys,codecs
>>> out = codecs.getwriter('utf-8')(sys.stdout)
>>> out.write('hello')
hello
>>>

In Console

Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys,codecs
>>> out = codecs.getwriter('utf-8')(sys.stdout)
>>> out.write('hello')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python31\Lib\codecs.py", line 356, in write
    self.stream.write(data)
TypeError: must be str, not bytes
>>>

Platform: Windows XP EN

Upvotes: 3

Views: 1045

Answers (3)

Lennart Regebro

Reputation: 172179

IDLE redirects stdout to its own GUI output window. That replacement stream apparently accepts bytes as well as strings, which the normal console stdout does not.

Either decode the data to a Unicode string before writing it, or write the bytes to sys.stdout.buffer.
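
A minimal sketch of both options, using an in-memory io.TextIOWrapper as a stand-in for the console's sys.stdout (the stand-in and the variable names are assumptions for illustration, not part of the original script):

```python
import io

# Stand-in for the console's sys.stdout: a text layer over a binary buffer.
raw = io.BytesIO()
text_out = io.TextIOWrapper(raw, encoding="utf-8")

data = "hello".encode("utf-8")  # bytes, as a UTF-8 StreamWriter would produce

try:
    text_out.write(data)        # a text stream accepts only str
    rejected = False
except TypeError:
    rejected = True             # same kind of failure as in the console session

text_out.write(data.decode("utf-8"))  # option 1: decode back to str first
text_out.flush()
text_out.buffer.write(data)           # option 2: write bytes to the binary layer
written = raw.getvalue()
```

With the real console stream, the same two calls would be sys.stdout.write(...) with a str and sys.stdout.buffer.write(...) with bytes.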

Upvotes: 1

Gnu Engineer

Reputation: 1545

It seems obvious that the console's encoding is not UTF-8. There is a way to specify the encoding as an optional parameter when invoking Python from the console; look for it in the Python docs.
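
As a hedged illustration, the encoding the interpreter picked for standard output can be inspected as below. The PYTHONIOENCODING environment variable can override that choice; note that it is an environment variable rather than a command-line parameter, so it may not be the mechanism this answer had in mind. This affects encoding only and does not change whether a stream accepts str or bytes.

```python
import sys

# Encoding the interpreter selected for standard output.
# Some replacement streams may not expose it, hence the getattr fallback.
encoding = getattr(sys.stdout, "encoding", None)
print("stdout encoding:", encoding)
```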

Upvotes: -1

Greg Hewgill

Reputation: 992717

Your encoded output is coming out of the encoder as bytes, and therefore must be passed to sys.stdout.buffer:

out = codecs.getwriter('utf-8')(sys.stdout.buffer)

I'm not entirely sure why your code acts differently in IDLE versus the console, but the above may help. Perhaps IDLE's sys.stdout actually expects bytes instead of characters (hopefully it has a .buffer that also expects bytes).
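
A rough sketch of how the translation loop might look with the writer wrapped around a binary stream. The shortened PUNCT table and the translate_stream helper are hypothetical, chosen for illustration; an io.BytesIO stands in for sys.stdout.buffer so the result can be inspected.

```python
import codecs
import io

# Hypothetical, shortened mapping (not the full table from the question).
PUNCT = str.maketrans("。、【】《》", ".,[]<>")

def translate_stream(lines, binary_out):
    """Translate punctuation in each line and write UTF-8 bytes to binary_out."""
    out = codecs.getwriter("utf-8")(binary_out)  # the writer emits bytes
    for line in lines:
        out.write(line.translate(PUNCT))

# Demonstration against an in-memory binary stream.
buf = io.BytesIO()
translate_stream(["你好。【测试】\n"], buf)
result = buf.getvalue().decode("utf-8")
print(buf.getvalue())  # raw UTF-8 bytes, as they would reach the console
```

In the original script the call would presumably be translate_stream(sys.stdin, sys.stdout.buffer) after importing sys.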

Upvotes: 6
