Samiron

Reputation: 5317

Understanding lexical scoping of "use open ..." of Perl

use open qw( :encoding(UTF-8) :std );

The above statement seems to take effect only within its lexical scope and should not affect anything outside of that scope. But I have observed the following.

$ cat data
€
#1
$ perl -e '
    open (my $fh, "<:encoding(UTF-8)", "data");
    print($_) while <$fh>;'
Wide character in print at -e line 1, <$fh> line 1.
€

The Wide character ... warning is expected here. But:

#2
$ perl
  my ($fh, $row);
  {
      use open qw( :encoding(UTF-8) :std );
      open ($fh, "<", "data");
  }
  $row = <$fh>;
  chomp($row);
  printf("%s (0x%X)", $row, ord($row));

  € (0x20AC)

This does not show the wide character warning! Here is what I think is going on.

Now look at the following slight variation:

#3
my ($fh, $row);
{
    use open qw( :encoding(UTF-8) :std );
}
open ($fh, "<", "data");
$row = <$fh>;
chomp($row);
printf("%s (0x%X)", $row, ord($row));

â¬ (0xE2)

This time, since the open statement is outside the lexical scope of the pragma, the file was opened without a UTF-8 layer.

Does this mean that the use open qw( :encoding(UTF-8) :std ); statement changes STDOUT globally, but changes input handles only within its lexical scope?

Upvotes: 1

Views: 246

Answers (2)

Håkon Hægland

Reputation: 40778

Unfortunately, the :std part of the open pragma does not behave lexically, since it changes the IO layers associated with the standard handles STDIN, STDOUT and STDERR globally. Even code earlier in the source file is affected, since the use statement takes effect at compile time. So the following

use feature qw(say);
say join ":", PerlIO::get_layers(\*STDIN);
{
   use open qw( :encoding(UTF-8) :std );
}

prints (on my Linux platform):

unix:perlio:encoding(utf-8-strict):utf8

whereas without the use open qw( :encoding(UTF-8) :std ) it would just print unix:perlio.
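The compile-time effect can also be observed directly by capturing the layers in a BEGIN block that runs before the pragma is compiled. This is only a minimal sketch, assuming that neither PERL_UNICODE nor PERLIO is set in the environment; @before and @after are names made up for this illustration:

```perl
use strict;
use warnings;

# Hypothetical variable names, for illustration only.
our @before;
BEGIN {
    # Runs at compile time, before the pragma further down has been compiled.
    @before = PerlIO::get_layers(\*STDOUT);
}

{
    use open qw( :encoding(UTF-8) :std );   # applied as soon as it is compiled
}

# Run time, outside the block: the layer is still attached to STDOUT.
my @after = PerlIO::get_layers(\*STDOUT);

print "before: @before\n";
print "after:  @after\n";
```

Under those assumptions, only the second list should contain an encoding layer, even though the pragma sits inside a block that the run-time code never depends on.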

One way to avoid affecting the global STDOUT, for example, is to duplicate the handle within a lexical scope and then add IO layers to the duplicate handle within that scope:

use feature qw(say);
use strict;
use warnings;
use utf8;

my $str = "€";
say join ":", PerlIO::get_layers(\*STDOUT);
{
    open ( my $out, '>&STDOUT' ) or die "Could not duplicate stdout: $!";
    binmode $out, ':encoding(UTF-8)';
    say $out $str;
}
say join ":", PerlIO::get_layers(\*STDOUT);
say $str;

with output:

unix:perlio
€
unix:perlio
Wide character in say at ./p.pl line 16.
€

Upvotes: 1

brian d foy

Reputation: 132905

You aren't using STDIN. You're opening a file with an explicit encoding (except for your last example) and reading from that.

The use open qw(:std ...) affects the standard file handles, but you're only using standard output. When you don't use that and print UTF-8 data to standard output, you get the warning.

In your last example, you don't read the data with an explicit encoding, so when you print it to standard output, it's already corrupted.
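The difference between the two reads can be sketched without touching the standard handles at all. This is a minimal illustration, assuming a scratch file created with File::Temp stands in for the data file from the question; the variable names are made up here:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file holding the three UTF-8 bytes of the euro sign.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "\xE2\x82\xAC\n";
close $tmp or die "close failed: $!";

# Read once through an explicit decoding layer, and once with no layer.
open my $decoded_fh, '<:encoding(UTF-8)', $path or die "open failed: $!";
open my $raw_fh,     '<:raw',             $path or die "open failed: $!";

my $decoded = <$decoded_fh>;
my $raw     = <$raw_fh>;
chomp $decoded;
chomp $raw;

# Report lengths and code points only, so no wide character reaches STDOUT.
printf "decoded: length %d, first ord 0x%X\n", length $decoded, ord $decoded;
printf "raw:     length %d, first ord 0x%X\n", length $raw,     ord $raw;
```

The decoded read yields one character (0x20AC), while the raw read yields three bytes starting with 0xE2, which matches the two results shown in the question.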

That's the trick of encodings no matter what they are. Every part of the process has to be correct.

If you want use open to affect all file handles, you have to import it differently. There are several examples at the top of the documentation.
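As a rough sketch of one such import form, use open IO => ... sets the default layers for every open compiled in the enclosing lexical scope, which is the rest of the file when the pragma sits at file level. Handles opened by code in other files are not covered by this. The scratch file and variable names below are assumptions for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# File-level pragma: default layers for input and output handles
# opened by code in the rest of this file.
use open IO => ':encoding(UTF-8)';

# Hypothetical scratch file, for illustration only.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp or die "close failed: $!";

# No explicit layer in either open; both pick up the default set above.
open my $out, '>', $path or die "open failed: $!";
print {$out} "\x{20AC}\n";          # encoded to UTF-8 on the way out
close $out or die "close failed: $!";

open my $in, '<', $path or die "open failed: $!";
my @layers = PerlIO::get_layers($in);
my $row = <$in>;
chomp $row;

printf "layers: %s\n", join ':', @layers;
printf "length %d, ord 0x%X\n", length $row, ord $row;
```

Because STDOUT is left alone here, the sketch prints only the code point rather than the character itself.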

Upvotes: 3
