Jeff

Reputation: 737

Infinite while-loop in Perl

Is there a way to do this without getting an infinite loop?

while((my $var) = $string =~ /regexline(.+?)end/g) {
    print $var;
}

This results in an infinite loop, probably because assigning a variable directly from a regex match inside the while condition returns "true" every time?

I know I can do this:

while($string =~ /regexline(.+?)end/g) {
    my $var = $1;
    print $var;
}

But I was hoping I could save a line. Is there a regex modifier I can use or something like that?

Also, what is this notation/trick actually called, if I want to search for it?

(my $var) = $string =~ /regex/;

Thanks!!

Upvotes: 10

Views: 7606

Answers (7)

tadmc

Reputation: 3744

Is there a way to do this without getting an infinite loop?

Yes. Use a foreach() instead of a while() loop:

foreach my $var ($string =~ /regexline(.+?)end/g) { print $var }

what is this notation/trick actually called, if I want to search for it

It is called a match in list context. It is described in "perldoc perlop":

The g modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context ...
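To make the list-context behaviour concrete, here is a minimal sketch. The sample string is made up for illustration and is not from the question. It shows that the match hands back every capture at once, and that assigning to (my $var) simply keeps the first element of that list.

```perl
use strict;
use warnings;

# Hypothetical sample input, not from the question.
my $string = 'regexlineAAAend regexlineBBBend regexlineCCCend';

# List context with /g: every capture is returned in one go.
my @all = $string =~ /regexline(.+?)end/g;

# List assignment to a single variable: the same full list is produced,
# but only its first element is kept.
my ($first) = $string =~ /regexline(.+?)end/g;

print "all:   @all\n";     # all:   AAA BBB CCC
print "first: $first\n";   # first: AAA
```

Because the whole list is produced each time, nothing remembers where the previous match ended, which is why foreach (which evaluates the match once) works where the while condition does not.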

Upvotes: 8

ikegami

Reputation: 385655

This is the one circumstance where you can't avoid using the global vars without changing the behaviour.

while ($string =~ /regexline(.+?)end/g) {
    my $var = $1;
    ...
}

If you only have one capture, you can avoid using the global vars by finding all the matches at once.

for my $var ($string =~ /regexline(.+?)end/g) {
    ...
}

The extra cost of the second version is usually negligible.
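The "only one capture" condition matters because list context flattens the captures of all matches into a single list. A small sketch, using a made-up key=value string as an assumption, shows the difference between the two forms when the pattern has two captures:

```perl
use strict;
use warnings;

# Hypothetical input with two captures per match (key and value).
my $string = 'a=1;b=2;c=3;';

# Scalar-context loop: $1 and $2 belong to the current match.
my @pairs;
while ($string =~ /(\w+)=(\d+);/g) {
    push @pairs, "$1:$2";
}

# List context: captures from all matches arrive as one flat list.
my @flat = $string =~ /(\w+)=(\d+);/g;

print "@pairs\n";   # a:1 b:2 c:3
print "@flat\n";    # a 1 b 2 c 3
```

A for loop over the flat list would visit six items, one per capture, and not three matches, so the while form with $1 and $2 is the simpler choice there.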

Upvotes: 1

TLP

Reputation: 67900

I don't know what you intend to do with this print, but this is a nice way of doing it:

say for $string =~ /regex(.+?)end/g;

The for (a synonym for foreach) evaluates the match in list context, which yields the list of captures, and say prints each one. Note that say needs use feature 'say' (or use v5.10 or later). It works like this:

my @matches = $string =~ /regex(.+?)end/g;
say for @matches;

while is somewhat different. Since it evaluates the match in scalar context, it steps through the matches one at a time instead of building the whole list of captures in memory first.

say $1 while $string =~ /regex(.+?)end/g;

It does the same thing as your original code was meant to do, except that we don't need the temporary variable $var; we just print the capture right away.

Upvotes: 0

mercator

Reputation: 28656

The Perl regular expressions tutorial says:

In scalar context, successive invocations against a string will have //g jump from match to match, keeping track of position in the string as it goes along.

But:

In list context, //g returns a list of matched groupings, or if there are no groupings, a list of matches to the whole regexp.

That is to say, in list context //g returns a list of all your captured matches at once (of which you subsequently discard all but the first), and then does that all over again every time the loop condition is evaluated (i.e. forever).

So you can't use the list context assignment in a loop condition, because it doesn't do what you want.

If you insist on using the list context, you could do this instead:

foreach my $var ($string =~ /regexline(.+?)end/g) {
    print $var;
}
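If you want to see the runaway behaviour without hanging your terminal, a capped version of the original loop is one way to do it. This is only a sketch: the sample string and the $guard counter are my own additions for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical sample input, not from the question.
my $string = 'regexlineAAAend regexlineBBBend';

my @seen;
my $guard = 0;   # safety cap so this demonstration terminates
while ((my $var) = $string =~ /regexline(.+?)end/g) {
    push @seen, $var;            # always the first capture
    last if ++$guard >= 5;       # without this, the loop never ends
}

print "@seen\n";   # AAA AAA AAA AAA AAA
```

Every pass re-runs the whole match from the start of the string, so $var never moves on to BBB and the condition never becomes false.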

Upvotes: 8

mob

Reputation: 118605

In scalar context, a regular expression with the /g modifier will act like an iterator and return a false value when there are no more matches:

print "$1\n" while "abacadae" =~ /(a\w)/g;     # produces "ab","ac","ad","ae"

With the assignment inside the while expression, you are evaluating your regular expression in list context. Now your regular expression doesn't act like an iterator anymore, it just returns the list of matches. The list assignment, evaluated as the loop condition, yields the number of elements the match returned, which is a true value whenever there is at least one match:

print "$1\n" while () = "abacadae" =~ /(a\w)/g;   # infinite "ae"

To fix this, you can take the assignment out of the while condition and use the built-in $1 variable to make the assignment inside the loop:

while ($string =~ /regexline(.+?)end/g) {
    my $var = $1;
    print $var;
}
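The iterator behaviour can be observed through pos(), which reports where the last scalar-context /g match on a string ended. The following is a minimal sketch using the sample string from this answer; the @trace and $count names are just for illustration.

```perl
use strict;
use warnings;

my $string = 'abacadae';   # sample string used earlier in this answer

# Scalar context: each successful match advances pos($string).
my @trace;
while ($string =~ /(a\w)/g) {
    push @trace, $1 . '@' . pos($string);
}

# A list assignment evaluated in scalar context yields the number of
# elements on its right-hand side, here the number of matches.
my $count = () = $string =~ /(a\w)/g;

print "@trace\n";   # ab@2 ac@4 ad@6 ae@8
print "$count\n";   # 4
```

The second half shows the value the while condition actually tests in the infinite-loop version: a non-zero count, every time.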

Upvotes: 10

Kevin Marshall

Reputation: 1

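I think your best bet is to remove each match from $string within the loop, so that the next evaluation of the condition finds the following match. Note that this destroys the contents of $string:

while((my $var) = $string =~ /regexline(.+?)end/g) {
  # Remove the whole match, not just the capture; \Q...\E keeps any
  # regex metacharacters in $var from being interpreted.
  $string =~ s/regexline\Q$var\Eend//;
  print $var . "\n";
}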

Upvotes: 0

Ray Toal

Reputation: 88378

There are several ways to do this with less code.

Let's say you have a file called lines.txt:

regexlineabcdefend
regexlineghijkend
regexlinelmnopend
regexlineqrstuend
This line does not match
Neither does this
regexlinevwxyzend

and you want to extract the pieces that match your regex, that is, chunks of the line between "regexline" and "end". A straightforward Perl script is:

while (<STDIN>) {
    print "$1\n" if $_ =~ /regexline(.+?)end/
}

When run like this

$ perl match.pl < lines.txt

you get

abcdef
ghijk
lmnop
qrstu
vwxyz

You can even do the whole thing on the command line!

$ perl -nle 'print $1 if $_ =~ /regexline(.+?)end/' < lines.txt
abcdef
ghijk
lmnop
qrstu
vwxyz

As far as your second question goes, I'm not sure there is a special Perl name for that trick.

Upvotes: 0
