Reputation: 1
I am running ActivePerl 5.16.3 on Windows 7 (32-bit).
My (short) program massages an input text file (encoded in UTF-8). I want the output to be encoded in Latin-1, so my code is:
open (OUT, '>;encoding(Latin1)', "out.txt") || die "Cannot open output file: $!\n";
print OUT "$string\n";
yet the resulting file is still in UTF-8. What am I doing wrong?
Upvotes: 0
Views: 500
Reputation: 57600
Firstly, the encoding layer is separated from the open mode by a colon, not a semicolon.
open OUT, '>:encoding(latin1)', "out.txt" or die "Cannot open output file: $!\n";
Secondly, Latin-1 can only encode a small subset of Unicode (U+0000 to U+00FF). Furthermore, the ASCII half of this subset is encoded identically in Latin-1 and UTF-8. We therefore have to use a test file with characters that are not encoded the same, e.g. \N{MULTIPLICATION SIGN} (U+00D7, ×), which is \xD7 in Latin-1 and \xC3\x97 in UTF-8.
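The byte-level difference can be checked directly. This is a minimal sketch, assuming the core Encode module is available (it ships with Perl 5.8 and later); the variable names are only illustrative:

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00D7 MULTIPLICATION SIGN as a one-character Perl string
my $char = "\N{U+00D7}";

# Encode the same character into each target encoding
my $latin1 = encode('ISO-8859-1', $char);
my $utf8   = encode('UTF-8',      $char);

# unpack 'H*' renders the raw bytes as a hex string
printf "Latin-1: %s\n", unpack('H*', $latin1);   # Latin-1: d7
printf "UTF-8:   %s\n", unpack('H*', $utf8);     # UTF-8:   c397
```

One byte in Latin-1 versus two in UTF-8 is what makes this character a useful probe: an ASCII-only test file would look identical in both encodings.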
Also make sure that you actually decode the input file.
Here is how you could generate the test file:
$ perl -CSA -E'say "\N{U+00D7}"' > input.txt
Here is how you can test that you are properly recoding the file:
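use strict;
use warnings;
use autodie;  # open and close die with a useful message on failure
# Decode the input as UTF-8 and encode the output as Latin-1.
open my $in, "<:encoding(UTF-8)", "input.txt";
open my $out, ">:encoding(latin1)", "output.txt";
while (<$in>) {
    print { $out } $_;
}
close $out;  # flush the output before comparing file sizes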
Afterwards, input.txt and output.txt should have different lengths (3 bytes → 2 bytes).
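The size comparison can also be done from Perl itself. The following is a sketch rather than a definitive test: it creates the test file, recodes it, and reads both sizes with the -s file test. The :raw layer is an addition of mine, used here on the assumption that you want identical byte counts on Windows, where the default :crlf layer would otherwise add a carriage return to each line.

```perl
use strict;
use warnings;
use autodie;

# Create the test file: U+00D7 plus a newline, written as UTF-8.
# ":raw" switches off CRLF translation so byte counts match across platforms.
open my $gen, '>:raw:encoding(UTF-8)', 'input.txt';
print {$gen} "\N{U+00D7}\n";
close $gen;

# Recode: decode UTF-8 on input, encode Latin-1 on output.
open my $in,  '<:raw:encoding(UTF-8)',  'input.txt';
open my $out, '>:raw:encoding(latin1)', 'output.txt';
print {$out} $_ while <$in>;
close $in;
close $out;

# -s returns the size of a file in bytes.
my ($in_size, $out_size) = (-s 'input.txt', -s 'output.txt');
print "input.txt: $in_size bytes, output.txt: $out_size bytes\n";
```

If both sizes come out equal, either the input was never decoded or the output layer was not applied, which is the symptom described in the question.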
Upvotes: 2