Reputation: 11
I am trying to create a file whose name contains the Unicode character U+662F on Windows (via Perl or Python, either is fine for me). On Linux I get the character 是, but on Windows I get 是 instead, and somehow I cannot get the file name to come out as 是.
Python code -
import sys
name = unichr(0x662f)
print(name.encode('utf8').decode(sys.stdout.encoding))
perl code -
my $name .= chr(230).chr(152).chr(175); ##662f
print 'file name ::'. "$name"."txt";
Upvotes: 0
Views: 208
Reputation: 705
In Perl on Windows, I use Win32::Unicode, Win32::Unicode::File and Win32::Unicode::Dir. They work perfectly with Unicode characters in file names.
Just mind that Win32::Unicode::File::open() (and new()) have a reversed argument order compared to Perl's built-in open(): the mode comes first.
You do not need to encode the characters manually. Just insert them as they are (if your Perl script is saved as UTF-8), or use the \x{N} notation.
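Since the question allows Python as well, here is a minimal sketch of the same idea there, assuming Python 3 (where str is Unicode and file names are passed to the OS as text). It first reproduces the three odd characters from the question by decoding the UTF-8 bytes with a legacy code page, then creates the file by passing the character itself. The names CHAR and workdir are made up for the illustration, and cp1252 is only an assumed code page; the one on your machine may differ.

```python
import os
import tempfile

CHAR = "\u662f"  # the character from the question, written as a code point escape

# Why the question shows three odd characters: the UTF-8 bytes of U+662F
# (E6 98 AF) get reinterpreted one byte at a time in a legacy code page.
# cp1252 is an assumption for illustration; the actual code page may differ.
utf8_bytes = CHAR.encode("utf-8")
mojibake = utf8_bytes.decode("cp1252")

# Passing the str itself as the file name avoids that round trip.
# A temporary directory is used so the sketch leaves the working directory alone.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, CHAR + ".txt")
with open(path, "w", encoding="utf-8") as handle:
    handle.write("created with a Unicode file name\n")

# ascii() keeps the printout readable even on a console that cannot show the character.
names = os.listdir(workdir)
print(ascii(names))
```

The point of the sketch is only that no manual byte-level encoding is involved when naming the file; it does nothing about how the console displays the name.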
Printing Unicode to the console on Windows is another problem. You can't use cmd.exe. Instead use PowerShell ISE. The drawback of the ISE is that it's not a console: scripts can't take input from the keyboard through STDIN.
To get Unicode output, you need to set the output encoding to UTF-8 in every PowerShell ISE session that's started. I suggest doing so in the startup script.
1) In order for any user PowerShell scripts to be allowed to run, you first need to do:
Set-ExecutionPolicy RemoteSigned
2) Edit or create your Documents\WindowsPowerShell\Microsoft.PowerShellISE_profile.ps1 to something like:
perl -w -e "print qq!Initializing the console with Perl...\n!;"
[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8;
The short Perl command is there as a trick to allow the System.Console property to be modified. Without it, you get an error when setting the OutputEncoding.
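The host-side setting above only helps if the script itself emits UTF-8. For a Python script, a rough sketch of that half could look like the following, assuming Python 3. The helper name open_utf8_writer is hypothetical, and an in-memory buffer stands in for the real output stream so the emitted bytes can be inspected.

```python
import io

def open_utf8_writer(binary_stream):
    """Wrap a binary stream so that every str written to it is encoded as UTF-8."""
    # newline="\n" turns off newline translation so the bytes are predictable.
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")

# An in-memory buffer stands in for the real output stream.
sink = io.BytesIO()
writer = open_utf8_writer(sink)
writer.write("\u662f\n")
writer.flush()  # push the text layer's pending data down to the byte stream
emitted = sink.getvalue()
```

In a real script the same effect is usually obtained with sys.stdout.reconfigure(encoding="utf-8") (Python 3.7 and later) or by setting the PYTHONIOENCODING environment variable to utf-8. Whether the characters then show up correctly still depends on the host and the font.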
If I recall correctly, you also have to change the font to Consolas.
Even when the Unicode characters print out fine, you may have trouble including them in command line arguments. In these cases I've found the \x{N} notation works. The Windows Character Map utility is your friend here.
(Edited heavily after I rediscovered the regular PowerShell's inability to display most Unicode characters, with references to PowerShell (non-ISE) removed. Now I remember why I started using the ISE...)
Upvotes: 1