Reputation: 2639
I am on a Windows system.
I created two UTF-8 files, python_print.py for Python and perl_print.pl for Perl. Both files contain the same line:
print("中")
except that the Perl version ends with a ; delimiter.
My CMD uses code page 936 by default. When I run
python python_print.py
I get
中
However, when I run
perl perl_print.pl
for the first time, it prints
涓
Running it a second time, I got
why??
I continued testing and ran chcp 65001 to change the CMD code page to UTF-8. This time, both Python and Perl print the correct "中".
Now I am completely confused. It seems that print behaves quite differently in Python and Perl. Does Perl always print the raw UTF-8 bytes, while Python's print detects the CMD code page and emits the correct bytes? Can somebody explain my test results?
Upvotes: 2
Views: 254
Reputation: 98398
perl is printing the literal bytes you have in your source file. It sees the string as "\xe4\xb8\xad" unless you explicitly declare that your source file is UTF-8 with use utf8;.
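To see why those three bytes show up as 涓 on a code page 936 console, here is a minimal Python sketch of the byte-level mismatch. It only illustrates the encodings involved, not what perl does internally; the variable names are made up for the example, and "cp936" is Python's codec name for that code page.

```python
# Illustrative sketch: the same character in two encodings.
text = "中"

# Bytes stored in a UTF-8 source file (what perl passes through untouched).
utf8_bytes = text.encode("utf-8")

# Bytes a code page 936 console expects for the same character.
cp936_bytes = text.encode("cp936")

# Interpret the UTF-8 bytes as if they were code page 936 text.
# Bytes that cannot be decoded are replaced with U+FFFD.
misread = utf8_bytes.decode("cp936", errors="replace")

print(utf8_bytes)      # the three UTF-8 bytes
print(cp936_bytes)     # the two code page 936 bytes
print(ascii(misread))  # escaped form, safe to print on any console
```

The first two UTF-8 bytes happen to form a valid code page 936 character, which is the 涓 in the question; the third byte is an incomplete sequence on its own.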
Once you do that, you would (if you enabled warnings, as you should) get a Wide character in print warning; perl requires you to specify the encoding to use when outputting non-ASCII characters. You can do that with use open ':std' => ':encoding(cp936)'; or with binmode STDOUT, ':encoding(cp936)'; or (for some filehandle you are opening) with the third argument to open.
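For comparison, the idea of an encoding layer on an output handle can be sketched in Python, where text streams always carry an encoding. This is only an analogy to the Perl layers above, not a description of how the Python interpreter chooses its console encoding; the in-memory buffer stands in for the console and the names raw and console are hypothetical.

```python
import io

# An in-memory byte buffer standing in for the console's byte stream.
raw = io.BytesIO()

# Wrap it in a text layer that encodes to code page 936 on write,
# roughly analogous to binmode STDOUT, ':encoding(cp936)' in Perl.
console = io.TextIOWrapper(raw, encoding="cp936")

console.write("中")
console.flush()  # push the encoded bytes through to the buffer

encoded = raw.getvalue()
print(encoded)   # the bytes that would reach a cp936 console
```

With the encoding layer in place, the character is converted to the two bytes the console expects instead of being passed through as three UTF-8 bytes.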
Upvotes: 7