user3934476

Reputation: 27

Obtaining overlapping matches using a regex

Please look at this Perl code first:

my $st = 'aaaa';

while ( $st =~ /aa/g ) {
    print $&, "\n";
}

I want each match to start one character after the start of the previous one, so I expect three matches of aa (at offsets 0, 1 and 2).

However, I only get two matches.

How can I get all three?

Upvotes: 0

Views: 254

Answers (4)

ikegami

Reputation: 386706

The following will do the trick:

while ($st =~ /(?=(aa))/g) {
   print "$1\n";
}
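
If you want all the matches at once rather than looping, the same pattern can be used in list context. This is only a minimal sketch; the array name @matches is made up for illustration:

```perl
use strict;
use warnings;

my $st = 'aaaa';

# In list context, m//g returns the text captured by group 1 for every match.
my @matches = $st =~ /(?=(aa))/g;

print scalar(@matches), " matches: @matches\n";
```
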

Upvotes: 1

Miller

Reputation: 35218

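Your problem is that regular expressions do not normally return overlapping matches. With /g, each search resumes where the previous match ended, so characters that were already consumed cannot be part of the next match.

You can see this by printing the positional information for your two current matches: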

my $st = 'aaaa';

while ( $st =~ /aa/g ) {
    print "pos $-[0] - $&\n";
}

Outputs:

pos 0 - aa
pos 2 - aa

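To fix this, use a positive lookahead assertion with an explicit capture group. The lookahead consumes no characters, so the next search starts just one character further along, and the capture group makes the text available in $1: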

while ( $st =~ /(?=(aa))/g ) {
    print "pos $-[0] - $1\n";
}

Outputs:

pos 0 - aa
pos 1 - aa
pos 2 - aa
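
The same idea can be wrapped in a small helper if you need it for more than one pattern. This is a sketch only; the sub name overlapping_matches and its return format are assumptions made for this example, not anything built in:

```perl
use strict;
use warnings;

# Hypothetical helper: returns [offset, text] pairs for every
# overlapping match of $re in $str.
sub overlapping_matches {
    my ($str, $re) = @_;
    my @found;
    # Wrap the pattern in a lookahead so no characters are consumed.
    while ($str =~ /(?=($re))/g) {
        push @found, [ $-[0], $1 ];
    }
    return @found;
}

for my $m (overlapping_matches('aaaa', qr/aa/)) {
    printf "pos %d - %s\n", @$m;
}
```

Note that this finds at most one match per starting offset, so a variable-length pattern will only report the match the engine prefers at each position.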

Upvotes: 1

choroba

Reputation: 242443

Use a lookahead. It is zero-width, so it doesn't consume any characters, and each new search begins only one character after the previous one:

my $st = 'abcd';

while ($st =~ /(?=(..))/g) {
    print "$1\n";
}

I used a different string to make the matching positions visible.
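
To check that the position really is not advanced by the match itself, you can record pos() inside the loop. A minimal sketch (the @seen array is only there to collect the lines for this example):

```perl
use strict;
use warnings;

my $st = 'abcd';
my @seen;

while ($st =~ /(?=(..))/g) {
    # The lookahead is zero-width, so pos() still equals the match start.
    push @seen, pos($st) . ": $1";
}

print "$_\n" for @seen;
```

This should print 0: ab, 1: bc and 2: cd, one per line.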

Upvotes: 1

mpapec

Reputation: 50677

my $st = 'aaaa';
my $find = 'aa';

while($st =~ /$find/g){
    print $&,"\n";
    pos($st) -= (length($find)-1);
}

From perldoc -f pos:

Returns the offset of where the last m//g search left off for the variable in question ($_ is used when the variable is not specified)

pos() is also an lvalue subroutine, so you can assign to it like a variable to change where the next search starts.
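
The code above relies on length($find) being the length of the matched text, which only holds when $find is a literal string. A possible variant, assuming the pattern never matches the empty string, is to set pos() from the match start in $-[0] instead. The names $re and @found are made up for this sketch:

```perl
use strict;
use warnings;

my $st = 'aaaa';
my $re = qr/a{2}/;    # the pattern text is longer than what it matches

my @found;
while ($st =~ /$re/g) {
    push @found, sprintf('%d:%s', $-[0], $&);
    # Resume the search one character after the start of this match.
    pos($st) = $-[0] + 1;
}

print "$_\n" for @found;
```
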

Upvotes: 2
