Mose

Reputation: 563

Perl Hashref Substitutions

I'm using DBI to connect to Sybase and fetch records into a hashref. The DBD::Sybase driver has a nasty habit of returning values with trailing characters, specifically \x00 in my case. I'm trying to write a function that cleans this up for every element in the hashref. The code below does the trick, but I can't find a way to make it leaner, and I know there is a better way to do this:

#!/usr/bin/perl
use DBI;

my $dbh = DBI->connect('dbi:Sybase:...');

my $sql = qq {SELECT * FROM table WHERE age > 18;};
my $qry = $dbh->selectall_hashref($sql, 'Name');

foreach my $val (values %$qry) {
        $qry->{$val} =~ s/\x00//g;
}
foreach my $key (keys %$qry) {
        $qry->{$key} =~ s/\x00//g;
        foreach my $val1 (keys %{$qry->{$key}}) {
                $qry->{$key}->{$val1} =~ s/\x00//g;
        }
        foreach my $key1 (keys %{$qry->{$key}}) {
                $qry->{$key}->{$key1} =~ s/\x00//g;
        }
}

Upvotes: 0

Views: 315

Answers (2)

TLP

Reputation: 67900

While I think that a regex substitution is not exactly an ideal solution (seems like it should be fixed properly instead), here's a handy way to solve it with chomp.

use Data::Dumper;

my %a = (
    foo => {
        a => "foo\x00",
        b => "foo\x00"
    },
    bar => {
        c => "foo\x00",
        d => "foo\x00"
    },
    baz => {
        e => "foo\x00",
        f => "foo\x00"
    }
);
$Data::Dumper::Useqq=1;
print Dumper \%a;
{
    local $/ = "\x00";
    chomp %$_ for values %a;
}
print Dumper \%a;

chomp will remove a single trailing value equal to whatever the input record separator $/ is set to. When used on a hash, it will chomp the values.
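
A minimal sketch of that behaviour, assuming a made-up %row hash (it is not from the question): chomp on a hash trims the values only, returns how many characters it removed, and $/ goes back to its previous value once the block ends.

```perl
use strict;
use warnings;

# %row is a hypothetical record, used only for illustration
my %row = (name => "bar\x00", city => "baz");
my $before = $/;     # remember the separator in effect outside the block
my $removed;
{
    local $/ = "\x00";        # chomp now strips a trailing NUL, not a newline
    $removed = chomp(%row);   # chomps each value; returns total characters removed
}
my $restored = ($/ eq $before) ? 1 : 0;   # the local change ended with the block

printf "removed=%d name_len=%d restored=%d\n",
    $removed, length($row{name}), $restored;
```

Only "name" had a trailing NUL, so one character is removed and "city" is left alone.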

As you will note, nothing has to be assigned back: the elements returned by values are aliases, and chomp modifies the values of each inner hash in place. Note also the block around the local $/ statement, which restricts its scope.

For a more manageable solution, it's probably best to make a subroutine that calls itself recursively. I used chomp again here, but you can just as easily skip that and use s/\x00//g, or tr/\x00//d, which basically does the same thing. chomp is safer only in that it removes a single trailing \x00 and never touches characters elsewhere in the string, much like s/\x00$// would.
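
To see how these options differ, here is a rough comparison on a made-up string with one embedded NUL and two trailing ones. The variable names and the s/\x00+\z// variant are my own additions for illustration, not something from the question.

```perl
use strict;
use warnings;

my $raw = "a\x00b\x00\x00";    # hypothetical sample value

(my $sub_all  = $raw) =~ s/\x00//g;     # removes every NUL, embedded ones too
(my $tr_all   = $raw) =~ tr/\x00//d;    # same result as the s///g above
(my $trailing = $raw) =~ s/\x00+\z//;   # removes the whole trailing run only

my $chomped = $raw;
{
    local $/ = "\x00";
    chomp $chomped;                     # removes exactly one trailing NUL
}

printf "%d %d %d %d\n",
    length($sub_all), length($tr_all), length($trailing), length($chomped);
```

The lengths come out as 2, 2, 3 and 4, so if the driver can hand back more than one trailing \x00, a single chomp will not be enough.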

strip_null(\%a);
print Dumper \%a;

sub strip_null {
    local $/ = "\x00";
    my $ref = shift;
    for (values %$ref) {
        if (ref eq 'HASH') {
            strip_null($_); # recursive strip
        } else {
            chomp;
        }
    }
}

Upvotes: 1

Eric Strom

Reputation: 40142

First, your code:

    foreach my $val (values %$qry) {
        $qry->{$val} =~ s/\x00//g;
        # here you are using a value as if it was a key
    }
    foreach my $key (keys %$qry) {
        $qry->{$key} =~ s/\x00//g;
        foreach my $val1 (keys %{$qry->{$key}}) {
            $qry->{$key}->{$val1} =~ s/\x00//g;
        }

        foreach my $key1 (keys %{$qry->{$key}}) {
            $qry->{$key}->{$key1} =~ s/\x00//g;
        }
        # and this does the same thing twice...
    }

What you should do is:

foreach my $x (values %$qry) {
    foreach my $y (ref $x eq 'HASH' ? values %$x : $x) {
        $y =~ s/(?:\x00)+$//
    }
}

This cleans up only trailing nulls, and only in the values at the first two levels of the hash.

The body of the loop could also be written as:

    if (ref $x eq 'HASH') {
        foreach my $y (values %$x) {
            $y =~ s/(?:\x00)+$//
        }
    }
    else {
        $x =~ s/(?:\x00)+$//
    }

But that forces you to write the substitution twice, and you shouldn't repeat yourself.

Or if you really want to reduce the code, using the implicit $_ variable works well:

for (values %$qry) {
    s/(?:\x00)+$// for ref eq 'HASH' ? values %$_ : $_
}
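
If you want to try this out without a database, here is a small sketch. The $qry data is a made-up stand-in for what selectall_hashref returns (keyed by Name, one hashref of columns per row), plus one flat value to exercise the non-hash branch. I used \z instead of $ so that the match is anchored at the very end of the string.

```perl
use strict;
use warnings;

# Hypothetical sample data, not real query output
my $qry = {
    alice => { Name => "alice\x00", age => "30\x00\x00" },
    bob   => { Name => "bob",       age => "41\x00" },
    note  => "flat\x00",    # non-hash value, handled by the ternary fallback
};

for my $x (values %$qry) {
    # $y aliases either each column value or the top-level value itself
    for my $y (ref($x) eq 'HASH' ? values %$x : $x) {
        $y =~ s/\x00+\z//;  # strip any run of trailing NULs
    }
}

printf "%s %s %s\n", $qry->{alice}{age}, $qry->{bob}{age}, $qry->{note};
```

Because the loop variables are aliases, the substitutions change the original structure and nothing needs to be written back.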

Upvotes: 1
