Coltrane58

Reputation: 327

Perl: weird behavior with s///

I'm embarrassed to ask this because it is so simple, but I can't see what's wrong. I have a routine to clean up input for an IP range. It's sort of brute-force, but I don't know of a better way. The problem: when I try to remove the inner spaces within a conditional block, leaving only a '-' or ',' as the separator, a single leading and trailing space is left bracketing the separator. If I clean up the inner spaces outside the conditional block, the spaces are properly removed. So, in the sample code, if I only have the s/\s+//g on Line 1 it cleans properly; if I only have the s/\s+//g on Lines 2 and 3, spaces are left bracketing the '-' and ','. What the heck is wrong?

use feature qw(say);
use Data::Dumper qw(Dumper);

$input = "   192.168.1.1       198.168.1.254     ";
buildIpRangeArray ($input);

$input = "   192.168.1.1  ,     198.168.1.254     ";
buildIpRangeArray ($input);

$input = "   192.168.1.1    -  198.168.1.254     ";
buildIpRangeArray ($input);

sub buildIpRangeArray {

    say "input: $input";

    $input = shift;
    $input =~ s/^\s+//;
    $input =~ s/\s+$//;
#    $input =~ s/\s+//g;                    # Line 1. 
                                            # Works if this is uncommented
                                            # and lines 2 and 3 are omitted

    if ( index($input,' ') >= 0) {
        $input =~ s/\s+/ /g;                # this works
        say "cleaned input 2: $input";
        @range = split(/ /,$input);
        say Dumper(@range);
    }
    elsif ( index($input,',') >= 0) {
        $input =~ s/\s+//g;                 # Line 2
        say "cleaned input 3: $input";
        @range = split(/,/, $input);
        say Dumper(@range);
    }
    elsif ( index($input,'-') >= 0) {
        $input =~ s/\s+//g;                 # Line 3
        say "cleaned input 4: $input";
        @range = split(/-/, $input);
        say Dumper(@range);
    }
}
The output:
input:    192.168.1.1       198.168.1.254     
cleaned input 2: 192.168.1.1 198.168.1.254
$VAR1 = '192.168.1.1';
$VAR2 = '198.168.1.254';

input:    192.168.1.1  ,     198.168.1.254     
cleaned input 2: 192.168.1.1 , 198.168.1.254
$VAR1 = '192.168.1.1';
$VAR2 = ',';
$VAR3 = '198.168.1.254';

input:    192.168.1.1    -  198.168.1.254     
cleaned input 2: 192.168.1.1 - 198.168.1.254
$VAR1 = '192.168.1.1';
$VAR2 = '-';
$VAR3 = '198.168.1.254';

Upvotes: 1

Views: 65

Answers (2)

simbabque

Reputation: 54323

If you look at your debug output, it's fairly obvious what's going on. Let's take the second block of output with the comma.

input:    192.168.1.1  ,     198.168.1.254     
cleaned input 2: 192.168.1.1 , 198.168.1.254
$VAR1 = '192.168.1.1';
$VAR2 = ',';
$VAR3 = '198.168.1.254';

After input, there is a check to see if there's a space ' ' in the string. There is, here:

           V
192.168.1.1 , 198.168.1.254

Therefore the code never reaches the elsif for the comma (,) or the dash (-). You can verify that because the output always shows "cleaned input 2", never "cleaned input 3" or "cleaned input 4".

The next step is cleaning whitespace, where you say yourself it works. You replace every run of whitespace with a single space. That leaves the comma in the string, with one space on each side of it. Now you split on a single space, giving you

ip
,
ip
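To see the branch selection in isolation, here is a minimal sketch. The helper name first_matching_branch is made up for illustration; it repeats the trim from the question and then only reports which condition of the original chain would fire first.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical helper (not part of the original code): apply the same
# leading/trailing trim, then report which condition would match first.
sub first_matching_branch {
    my ($input) = @_;
    $input =~ s/^\s+//;
    $input =~ s/\s+$//;

    return 'space' if index($input, ' ') >= 0;   # checked first, as in the question
    return 'comma' if index($input, ',') >= 0;
    return 'dash'  if index($input, '-') >= 0;
    return 'none';
}

say first_matching_branch("   192.168.1.1  ,     198.168.1.254     ");  # space
say first_matching_branch("   192.168.1.1    -  198.168.1.254     ");   # space
say first_matching_branch("192.168.1.1,198.168.1.254");                 # comma
```

The comma branch is only reached when no space is left around the separator after trimming.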

Overall, your code is fairly naive. There is a lot of repetition, and you don't have use strict or use warnings, which makes it harder to debug. Depending on how this code is going to be used, I suggest a huge simplification.

sub buildIpRangeArray {
    my $input = shift;

    say "input: $input";

    my @range = grep {$_} split /[^0-9.]+/, $input;
    say Dumper @range;
    return;
}

We split on runs of characters that can't be in an IP address. This is naive too, as it will not verify you have actual IP addresses, but neither does your code. It will work for any amount of whitespace and any delimiters, even if they are text. We need the grep to remove the empty string that leading whitespace produces at the start of the list (split already drops trailing empty fields by default). An empty string "" in Perl evaluates as false, so grep will filter it out.
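A self-contained sketch of the same idea, run against the three inputs from the question. The name parse_ip_range is hypothetical, and it returns the list instead of dumping it. It also swaps grep {$_} for grep { length }, which drops only empty strings rather than every false value.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical name; same split as above, but returns the pieces.
sub parse_ip_range {
    my ($input) = @_;

    # Split on runs of anything that is not a digit or a dot,
    # then drop the empty field caused by leading whitespace.
    return grep { length } split /[^0-9.]+/, $input;
}

for my $input (
    "   192.168.1.1       198.168.1.254     ",
    "   192.168.1.1  ,     198.168.1.254     ",
    "   192.168.1.1    -  198.168.1.254     ",
) {
    my @range = parse_ip_range($input);
    say join '|', @range;    # 192.168.1.1|198.168.1.254 for each input
}
```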

Upvotes: 3

Georg Mavridis

Reputation: 2331

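Your first condition is true for all three of your sample inputs, because each of them still contains spaces after the leading and trailing whitespace is trimmed.

So you have to reorder the if/elsif/elsif chain so that the separators are checked before the space: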

if ( index($input,',') >= 0) {
    $input =~ s/\s+//g;                 # Line 2
    say "cleaned input 3: $input";
    @range = split(/,/, $input);
    say Dumper(@range);
}
elsif ( index($input,'-') >= 0) {
    $input =~ s/\s+//g;                 # Line 3
    say "cleaned input 4: $input";
    @range = split(/-/, $input);
    say Dumper(@range);
}
elsif ( index($input,' ') >= 0) {
    $input =~ s/\s+/ /g;                # this works
    say "cleaned input 2: $input";
    @range = split(/ /,$input);
    say Dumper(@range);
}

will probably yield the result you want.
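As a rough sketch of that reordering wrapped in a sub with use strict and use warnings: the name split_ip_range is made up, and it returns the list rather than printing it. The final fallback splits on runs of whitespace, which replaces the separate collapse-then-split step.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical name; mirrors the reordered chain but returns the pieces.
sub split_ip_range {
    my ($input) = @_;
    $input =~ s/^\s+//;
    $input =~ s/\s+$//;

    if ( index($input, ',') >= 0 ) {
        $input =~ s/\s+//g;              # strip all whitespace, keep the comma
        return split /,/, $input;
    }
    elsif ( index($input, '-') >= 0 ) {
        $input =~ s/\s+//g;              # strip all whitespace, keep the dash
        return split /-/, $input;
    }
    return split /\s+/, $input;          # whitespace is the only separator left
}

say join '|', split_ip_range("   192.168.1.1       198.168.1.254     ");
say join '|', split_ip_range("   192.168.1.1  ,     198.168.1.254     ");
say join '|', split_ip_range("   192.168.1.1    -  198.168.1.254     ");
```

Each call prints 192.168.1.1|198.168.1.254.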

Upvotes: 2
