Jeff Sims

Reputation: 21

Perl regex not working on variable after it has already matched a previous regex

Please help me figure out why the following does not work the way I expect. I should be able to skip all of the array items that don't match and then match on any of the ones that get through. Instead, I have to make a copy of the for loop variable $server and match on the copy once the first regex has let the item through. The variable $server still contains the same string, so I would expect to be able to match it against a second regex:

use strict;
use warnings;
use diagnostics;
use feature qw(say);

my @servers = ('server01', 'server02', 'vm13', 'vm02');

for my $server (@servers) {

    if ($server !~ m/server01|vm13|vm02/ig) {

        next;

    } else {
        say $server;  # Prints any string that contains
                      # server01, vm13, or vm02

        if ($server =~ m/server01/ig) {

            say $server; # Does not print the string that
                         # contains server01 here
        }

        say $server, " again..."; # The variable still works here
    }
}

This way does work:

use strict;
use warnings;
use diagnostics;
use feature qw(say);

my @servers = ('server01', 'server02', 'vm13', 'vm02');

for my $server (@servers) {

    my $server_copy = $server;

    if ($server !~ m/server01|vm13|vm02/ig) {

        next;

    } else {

        say $server;  # Prints the name of any server
                      # that contains server01, vm13, or vm02

        if ($server_copy =~ m/server01/ig) {
            say $server; # Now prints the name of that server
        }

        say $server, " again..."; # The variable still works here
    }
}

Any help would be appreciated.

Upvotes: 1

Views: 105

Answers (3)

zdim

Reputation: 66883

In short, it's the global modifier /g in the regex used in scalar context that causes this behavior.

From perlretut, under Global matching

In scalar context, successive invocations against a string will have //g jump from match to match, keeping track of position in the string as it goes along

Because it remembers its last position, the next match attempt only searches from there to the end of the string, as explained in the answer by choroba. A very useful tool here is use re qw(debug), which shows in detail what a regex is doing.
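To make the position tracking visible, here is a minimal sketch that inspects pos() around the two matches from the question. The names @log, $first and $second are only illustrative and not part of the original code.

```perl
use strict;
use warnings;
use feature qw(say);

my $server = 'server01';

# Before any /g match the position is undefined.
my @log;
push @log, defined pos($server) ? pos($server) : 'undef';

# The first scalar-context /g match succeeds and leaves pos at the end of the match.
my $first = $server =~ m/server01|vm13|vm02/ig ? 1 : 0;
push @log, defined pos($server) ? pos($server) : 'undef';

# The second /g match starts searching at pos, finds nothing, and resets pos.
my $second = $server =~ m/server01/ig ? 1 : 0;
push @log, defined pos($server) ? pos($server) : 'undef';

say "first=$first second=$second pos: @log";
```

The failed second match resets the position, which is why a third attempt on the same variable would succeed again.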

I've changed the code a little bit as well.

use strict;
use warnings;
use feature qw(say);
use diagnostics;

my @servers = ('server01', 'server02', 'vm13', 'vm02');

foreach my $server (@servers) {

    next if not $server =~ m/server01|vm13|vm02/i;

    say $server;  # Prints string with either server01, vm13, or vm02

    if ($server =~ m/server01/i) {
        say "Looking for: $server";
    }   

    say "$server, again..."; # The variable still works here
}   

If the list of servers to keep is long you can make use of none from the core List::Util module.

use List::Util qw(none);

if (none { /$server/ } @keep_servers) {
    say "Skipping $server";
    next;
}
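For completeness, a self-contained sketch of the snippet above inside a loop. The contents of @keep_servers and the @kept array are assumptions made up for illustration, and the pattern is wrapped in \Q...\E so the server name is treated as a literal string.

```perl
use strict;
use warnings;
use feature qw(say);
use List::Util qw(none);   # none() needs List::Util 1.33 or later

my @servers      = ('server01', 'server02', 'vm13', 'vm02');
my @keep_servers = ('server01', 'vm13', 'vm02');   # hypothetical keep list

my @kept;
for my $server (@servers) {
    # Skip when no entry of @keep_servers contains this server name.
    if (none { /\Q$server\E/i } @keep_servers) {
        say "Skipping $server";
        next;
    }
    push @kept, $server;
    say $server;
}
```

No /g modifier is involved here, so no match position is carried over between tests.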

There are other ways to manipulate arrays (originally I used not any, thanks to Borodin for a note).

If you only need to skip them then of course you can simply loop over @keep_servers instead. Such a list can be constructed as, for example

my @keep_servers = grep { not /server02/ } @servers;

This may be suitable if you know which to drop and they form a much shorter list than those to keep.


Note: the functions all, any, none and notall are available in the core List::Util module as of Perl 5.20 (List::Util 1.33 or later).

With a version prior to Perl 5.20 these functions can be found in the List::MoreUtils module, while simple implementations were shown in the List::Util docs.

Upvotes: 3

choroba

Reputation: 241848

That's because of the /g flag. It remembers the position where it matched last time, and tries to match from that position onwards the next time. You can verify it with the string server01_server01 - it will be printed because it matches the regex twice.

Remove the /g flag if you don't need it.
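A short sketch of that check, assuming the same two regexes as in the question; the @names and @printed arrays are illustrative only.

```perl
use strict;
use warnings;
use feature qw(say);

my @names = ('server01', 'server01_server01');

my @printed;
for my $server (@names) {
    # The first /g match moves the match position past the first "server01".
    next if $server !~ m/server01|vm13|vm02/ig;

    # The second /g match resumes from that position, so it only
    # succeeds when the string contains another "server01" further on.
    push @printed, $server if $server =~ m/server01/ig;
}

say "matched twice: @printed";
```

Only the string with two occurrences passes both tests; with the /g flag removed from both regexes, each match starts from the beginning of the string and both names pass.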

Upvotes: 2

amaksr

Reputation: 7745

Change this line

if ($server_copy =~ m/server01/ig)

to this

if ($server_copy =~ m/server01/i)

Upvotes: 1
