Polar Bear

Reputation: 6808

SOLVED: Hash content access is inconsistent across different Perl versions

I came across an interesting problem with the following piece of code under Perl 5.22.1 and Perl 5.30.0:

use strict;
use warnings;
use feature 'say';

#use Data::Dumper;

my %hash;
my %seen;
my @header = split ',', <DATA>;

chomp @header;

while(<DATA>) {
    next if /^\s*$/;
    chomp;
    my %data;
    @data{@header} = split ',';

    push @{$hash{person}}, \%data;
    push @{$hash{Position}{$data{Position}}}, "$data{First} $data{Last}";
    if( ! $seen{$data{Position}} ) {
        $seen{$data{Position}} = 1;
        push @{$hash{Role}}, $data{Position};
    }
}

#say Dumper($hash{Position});

my $count = 0;
for my $person ( @{$hash{person}} ) {
    say "Person: $count";
    say "Role: $person->{Position}";
}

say "---- Groups ----\n";

while( my($p,$m) = each %{$hash{Position}} ) {
    say "-> $p";
    my $members = join(',',@{$m});
    say "-> Members: $members\n";
}

say "---- Roles ----";

say '-> ' . join(', ',@{$hash{Role}});

__DATA__
First,Last,Position
John,Doe,Developer
Mary,Fox,Manager
Anna,Gulaby,Developer

If the code is run as it is, everything works fine.

Now it is sufficient to add a $count++ increment, as below, and the code produces errors:

my $count = 0;
for my $person ( @{$hash{person}} ) {
    $count++;
    say "Person: $count";
    say "Role: $person->{Position}";
}

Errors:

Error(s), warning(s):
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 24, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 3.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 3.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 4.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in join or string at source_file.pl line 48, <DATA> line 4.

This problem does not manifest itself in perl 5.30.0 (Windows 10, Strawberry Perl) or Perl v5.24.2.

Note: the problem manifests itself not only with $count++ but with any other access to the content of the hash next to say "Person: $count"; -- post# 60653651

I would like to hear comments on this situation. What is the cause?

CAUSE: the input data has DOS-style line endings (\r\n). When the data is processed on Linux, chomp removes only the \n and leaves the \r as part of the last field name (which is used as a hash key). Thanks go to Shawn for pointing out the source of the issue.

SOLUTION: a fix was implemented in the form of a snip_eol($arg) subroutine:
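To illustrate the cause described above, here is a minimal sketch that does not depend on any file. The header string is an assumption standing in for the first line of a file saved with \r\n line endings; the variable names are only for illustration.

```perl
use strict;
use warnings;

# A header line as it would arrive from a file with DOS (\r\n) line endings.
my $line = "First,Last,Position\r\n";

# With the default $/ ("\n"), chomp strips only the trailing "\n".
chomp(my $chomped = $line);
my @header = split /,/, $chomped;

# Make the invisible carriage return visible before printing it.
(my $visible = $header[2]) =~ s/\r/\\r/g;

print "last field: '$visible'\n";
print "equals 'Position': ", ($header[2] eq 'Position' ? 'yes' : 'no'), "\n";
```

The last header field keeps its trailing \r, so a later lookup with the plain key Position finds nothing.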

use strict;
use warnings;
use feature 'say';

my $debug = 0;

say "
Perl:  $^V
OS: $^O
-------------------
" if $debug;

my %hash;
my %seen;
my @header = split ',', <DATA>;

$header[2] = snip_eol($header[2]);        # problem fix

while(<DATA>) {
    next if /^\s*$/;

    my $line = snip_eol($_);              # problem fix

    my %data;
    @data{@header} = split ',',$line;

    push @{$hash{person}}, \%data;
    push @{$hash{Position}{$data{Position}}}, "$data{First} $data{Last}";
    if( ! $seen{$data{Position}} ) {
        $seen{$data{Position}} = 1;
        push @{$hash{Role}}, $data{Position};
    }
}

#say Dumper($hash{Position});

my $count = 0;
for my $person ( @{$hash{person}} ) {
    $count++;
    say "-> Name:   $person->{First} $person->{Last}";
    say "-> Role:   $person->{Position}\n";
}

say "---- Groups ----\n";

while( my($p,$m) = each %{$hash{Position}} ) {
    say "-> $p";
    my $members = join(',',@{$m});
    say "-> Members: $members\n";
}

say "---- Roles ----";

say '-> ' . join(', ',@{$hash{Role}});

sub snip_eol {
    my $data = shift;                      # problem fix

    #map{ say "$_ => " . ord } split '', $data if $debug;
    $data =~ s/\r// if $^O eq 'linux';
    chomp $data;
    #map{ say "$_ => " . ord } split '', $data if $debug;

    return $data;
}

__DATA__
First,Last,Position
John,Doe,Developer
Mary,Fox,Manager
Anna,Gulaby,Developer
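As a possible alternative to the snip_eol above, the following sketch strips a trailing line ending without checking $^O. The name snip_eol_any is hypothetical and not part of the original post. The original version removes the first \r anywhere in the string, and only when $^O is 'linux', so it would skip other Unix-like systems (macOS, for example, reports 'darwin').

```perl
use strict;
use warnings;

# Hypothetical variant of snip_eol: removes one trailing line ending,
# either "\n" or "\r\n", regardless of the operating system.
sub snip_eol_any {
    my $data = shift;
    $data =~ s/\r?\n\z//;    # optional CR, then LF, anchored at end of string
    return $data;
}

# Unix ending, DOS ending, and no ending at all.
for my $raw ("Position\n", "Position\r\n", "Position") {
    my $clean = snip_eol_any($raw);
    print length($clean), ": $clean\n";
}
```

All three inputs come out as the same 8-character string, so the resulting hash keys match however the file was saved.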

Upvotes: 1

Views: 96

Answers (1)

Shawn

Reputation: 52579

I can replicate this behavior by (on Linux) first converting the source file to have Windows-style \r\n line endings and then trying to run it. I thus suspect that in your testing of various versions you're sometimes using Windows and other times Linux/Unix, and not converting the file's line endings appropriately.

chomp only removes a newline character (well, the current value of $/, to be pedantic), so when used on a string with a Windows-style line ending, it leaves the carriage return. The hash key is not "Position", it's "Position\r", which is not what the rest of your code uses.
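One way to spot such a hidden character is to dump the hash with Data::Dumper's Useqq option enabled, which prints control characters as escapes. This is only a sketch: the two strings are assumptions standing in for a header and a data line after chomp on a \r\n file.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq    = 1;    # show control characters as escapes such as \r
$Data::Dumper::Sortkeys = 1;    # stable key order in the dump

# Header and record as they look after chomp on lines ending in \r\n.
my @header = split /,/, "First,Last,Position\r";
my %data;
@data{@header} = split /,/, "John,Doe,Developer\r";

print Dumper(\%data);

# The plain key is absent; only the key with the trailing CR exists.
print 'plain key exists:   ', (exists $data{Position}     ? 1 : 0), "\n";
print 'key with CR exists: ', (exists $data{"Position\r"} ? 1 : 0), "\n";
```

Without Useqq the carriage return is written out raw, and a terminal will typically not show it as a visible character, which makes the bad key easy to overlook.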

Upvotes: 4
