Barton Chittenden

Reputation: 4416

What do the non-printable characters in the Perl symbol table represent?

I just learned that in Perl, the symbol table for a given package is stored in a hash named after the package with a trailing double colon, so, for example, the symbol table for the fictional module Foo::Bar would be %Foo::Bar::. The default symbol table is stored in %main::. Out of curiosity, I decided to see what was in %main::, so I iterated through each key/value pair in the hash, printing them out as I went:

#! /usr/bin/perl

use v5.14;
use strict;
use warnings;

my $foo;
my $bar;
my %hash;

while ( my ( $key, $value ) = each %:: ) {
    say "Key: '$key' Value '$value'";
}

The output looked like this:

Key: 'version::' Value '*main::version::'
Key: '/' Value '*main::/'
Key: '' Value '*main::'
Key: 'stderr' Value '*main::stderr'
Key: '_<perl.c' Value '*main::_<perl.c'
Key: ',' Value '*main::,'
Key: '2' Value '*main::2'
...

I was expecting to see the STDOUT and STDERR file handles, and perhaps @INC and %ENV. What I wasn't expecting to see was non-printable characters. The code block above doesn't show it, but the third line of the output actually contained a glyph indicating a non-printable character.

I ran the script and piped it as follows:

perl /tmp/asdf.pl | grep '[^[:print:]]' | while IFS= read -r line
do
    echo "$line"
    od -c <<< "$line"
    echo
done

The output looked like this:

Key: '' Value '*main::'
0000000   K   e   y   :       ' 026   '       V   a   l   u   e       '
0000020   *   m   a   i   n   :   : 026   '  \n
0000032

Key: 'ARNING_BITS' Value '*main::ARNING_BITS'
0000000   K   e   y   :       ' 027   A   R   N   I   N   G   _   B   I
0000020   T   S   '       V   a   l   u   e       '   *   m   a   i   n
0000040   :   : 027   A   R   N   I   N   G   _   B   I   T   S   '  \n
0000060

Key: '' Value '*main::'
0000000   K   e   y   :       ' 022   '       V   a   l   u   e       '
0000020   *   m   a   i   n   :   : 022   '  \n
0000032

Key: 'E_TRIE_MAXBUF' Value '*main::E_TRIE_MAXBUF'
0000000   K   e   y   :       ' 022   E   _   T   R   I   E   _   M   A
0000020   X   B   U   F   '       V   a   l   u   e       '   *   m   a
0000040   i   n   :   : 022   E   _   T   R   I   E   _   M   A   X   B
0000060   U   F   '  \n
0000064

Key: ' Value '*main:'
0000000   K   e   y   :       '  \b   '       V   a   l   u   e       '
0000020   *   m   a   i   n   :   :  \b   '  \n
0000032

Key: '' Value '*main::'
0000000   K   e   y   :       ' 030   '       V   a   l   u   e       '
0000020   *   m   a   i   n   :   : 030   '  \n
0000032

So what are non-printable characters doing in the Perl symbol table? What are they symbols for?

Upvotes: 10

Views: 773

Answers (2)

Ilmari Karonen

Reputation: 50328

Guru is on the right track: specifically, the answer is to be found in perlvar, which says:

"Perl variable names may also be a sequence of digits or a single punctuation or control character. These names are all reserved for special uses by Perl; for example, the all-digits names are used to hold data captured by backreferences after a regular expression match. Perl has a special syntax for the single-control-character names: It understands ^X (caret X) to mean the control-X character. For example, the notation $^W (dollar-sign caret W) is the scalar variable whose name is the single character control-W. This is better than typing a literal control-W into your program.

Since Perl 5.6, Perl variable names may be alphanumeric strings that begin with control characters (or better yet, a caret). These variables must be written in the form ${^Foo}; the braces are not optional. ${^Foo} denotes the scalar variable whose name is a control-F followed by two o's. These variables are reserved for future special uses by Perl, except for the ones that begin with ^_ (control-underscore or caret-underscore). No control-character name that begins with ^_ will acquire a special meaning in any future version of Perl; such names may therefore be used safely in programs. $^_ itself, however, is reserved."
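To connect the quoted passage to the od output in the question, here is a minimal sketch that maps a caret-notation name to the raw symbol-table key and checks whether that key exists in %main::. The helper name caret_to_key is made up for this illustration, and the script deliberately mentions $^W and ${^WARNING_BITS} so that both entries exist.

```perl
use strict;
use warnings;

# Hypothetical helper: map a caret-notation name such as "W" or
# "WARNING_BITS" to the raw symbol-table key. The first letter is
# replaced by the matching control character (letter XOR 0x40).
sub caret_to_key {
    my ($name) = @_;
    my $first = substr( $name, 0, 1 );
    return chr( ord($first) ^ 64 ) . substr( $name, 1 );
}

# Mention the variables so that their entries are created in %main::.
my @touched = ( $^W, ${^WARNING_BITS} );

for my $name (qw(W WARNING_BITS)) {
    my $key = caret_to_key($name);
    printf "%-14s first byte %03o  %s\n",
        "^$name", ord($key),
        exists $main::{$key} ? 'present in main' : 'absent from main';
}
```

The first byte is printed in octal so that it can be compared directly with the 027 that od showed in front of ARNING_BITS.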

If you want to print those names in a readable way, you could add a line like this to your code:

$key = '^' . ($key ^ '@') if $key =~ /^[\0-\x1f]/;

If the first character of $key is a control character, this will replace it with a caret followed by the corresponding letter (^A for control-A, ^B for control-B, etc.).
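Put together with the loop from the question, that could look roughly like the sketch below. The sub name readable_name is only for illustration. It relies on ^ acting as a string XOR, which is the case here because both operands are strings and the bitwise feature is not enabled; as far as I know, under use v5.28 or later that feature is on and the string form is spelled ^. instead.

```perl
use strict;
use warnings;

# Illustrative helper: turn a raw symbol-table key into caret notation
# when it starts with a control character (bytes 0x00-0x1f). Other keys
# are returned unchanged.
sub readable_name {
    my ($key) = @_;
    return $key unless $key =~ /^[\0-\x1f]/;

    # String XOR with '@' (0x40) maps control-A to 'A', control-W to 'W',
    # and so on. The shorter operand is padded with NUL bytes, so the
    # rest of the key is left untouched.
    return '^' . ( $key ^ '@' );
}

for my $key ( sort keys %main:: ) {
    printf "Key: '%s'\n", readable_name($key);
}
```

With this in place, the entry that od showed as 027 followed by ARNING_BITS is printed as ^WARNING_BITS, matching the ${^WARNING_BITS} spelling used in perlvar.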

Upvotes: 10

Guru

Reputation: 16974

Perl has special variables such as $", $/, $\ and $, (among others). All of these are part of the symbol table, which is what you are seeing. You should also be able to see @INC and %ENV in the symbol table.
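A quick way to check this yourself is to look the names up in %main:: directly. This is only a sketch: in_main_stash is a made-up helper name, and the script mentions the punctuation variables once so that their entries are certain to exist.

```perl
use strict;
use warnings;

# Made-up helper: true if %main:: has an entry with the given name.
# The name is the variable name without its sigil.
sub in_main_stash {
    my ($name) = @_;
    return exists $main::{$name};
}

# Mention the punctuation variables so that their entries are created.
my @touched = ( $", $/, $\, $, );

for my $name ( '"', '/', '\\', ',', 'INC', 'ENV' ) {
    printf "%-4s %s\n", $name, in_main_stash($name) ? 'present' : 'absent';
}
```

Note that one entry covers every variable of that name: the INC entry holds both @INC and %INC, since a symbol table entry is a glob with separate scalar, array and hash slots.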

Upvotes: 1
