Rock

Reputation: 347

How does this foreach loop work in Perl?

#!/usr/bin/perl
@lines = `perldoc -u -f atan2`;
foreach (@lines) {
  s/\w<([^>]+)>/\U$1/g;
  print;
}

How does the expression s/\w<([^>]+)>/\U$1/g; work?

Upvotes: 3

Views: 192

Answers (3)

Bee

Reputation: 958

Here is another option for figuring out what it is doing: use the module YAPE::Regex::Explain from CPAN.

Use it like this (this covers just the match part of the search and replace):

use strict;
use YAPE::Regex::Explain;

print YAPE::Regex::Explain->new(qr/\w<([^>]+)>/)->explain();

This gives the following output:

The regular expression:

(?-imsx:\w<([^>]+)>)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \w                       word characters (a-z, A-Z, 0-9, _)
----------------------------------------------------------------------
  <                        '<'
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^>]+                    any character except: '>' (1 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  >                        '>'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

The replacement part of the expression, \U$1, says that the text matched between "group and capture to \1" and "end of \1" is converted to uppercase and substituted for the whole match.
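To see the match and the replacement working together, here is a minimal sketch. The sample string and the variable names $text and $result are my own assumptions, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample text in POD markup, standing in for perldoc output.
my $text = 'See X<atan2> and B<bold words> here';

# \w matches the formatting letter, <...> delimits the captured text,
# and \U upper-cases everything that follows it in the replacement.
(my $result = $text) =~ s/\w<([^>]+)>/\U$1/g;

print "$result\n";   # prints: See ATAN2 and BOLD WORDS here
```

Note that the letter before the < and the angle brackets themselves are part of the match but not of the capture, so they disappear from the result.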

Upvotes: 4

Loki Astari

Reputation: 264381

The perl loop looks like this:

foreach $item (@array)
{
   # Code in here. ($item takes a new value from array each iteration)
}

But Perl lets you leave out the variable in many places, such as loop variables and many built-in functions and operators.
When you do this, the special variable $_ is used.

So in your case:

foreach (@lines) 
{
}

Is exactly the same as:

foreach $_ (@lines) 
{
}

Now inside the body the following code:

s/\w<([^>]+)>/\U$1/g;

The same thing happens here. The substitution always works on a variable, and when you do not specify one, Perl defaults to $_.

Thus it is the equivalent of:

$_ =~ s/\w<([^>]+)>/\U$1/g;

Combine the two:

foreach (@lines) {
  s/\w<([^>]+)>/\U$1/g;
  print;
}

Is equivalent to:

foreach $item (@lines)
{
    $item =~ s/\w<([^>]+)>/\U$1/g;
    print $item;
}

I use $item just for readability; it plays the role that $_ plays in the original code.

Lots of Perl code uses this type of shortcut. Personally I think it makes the code harder to read, even for experienced Perl programmers; it is one of the reasons Perl got a reputation for unreadability. As a result I always try to be explicit about the use of variables, though my usage is not typical Perl usage.
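As a rough sketch of the explicit style, here is a version that also runs under "use strict". The sample lines and the name $line are assumptions for illustration. It also shows a detail worth knowing: the foreach loop variable is an alias for each array element, so the substitution changes @lines itself.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the perldoc output.
my @lines = ("X<atan2> X<tan>\n", "plain text\n");

# A lexical loop variable keeps the code valid under "use strict".
# $line is an alias for each element, so the substitution edits @lines itself.
foreach my $line (@lines) {
    $line =~ s/\w<([^>]+)>/\U$1/g;
}

print @lines;   # the first element is now "ATAN2 TAN\n"
```

The same aliasing applies when the loop variable is left out and $_ is used implicitly.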

Upvotes: 0

TLP

Reputation: 67900

The substitution does this:

s/             
    \w<         # look for a single word character (alphanumeric or _) followed by <
    ([^>]+)     # capture one or more characters that are not >
    >           # followed by a >
/               ### replace with
   \U           # change following text to uppercase
   $1           # the captured string from above
/gx             # /g means do this as many times as possible per line

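I added the /x modifier to be able to visualize the regex. Note that /x only affects the pattern, so the comments on the replacement side are there for illustration only. The character class [^>] is negated, as denoted by the ^ character after the [, which means "any character except >".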

For example, this line in the output from the perldoc command

X<atan2> X<arctangent> X<tan> X<tangent>

is changed to

ATAN2 ARCTANGENT TAN TANGENT
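To try this on the sample line, here is a runnable sketch of the commented layout. It assumes the line is held in a variable I have called $line, and it keeps the replacement on a single line so that only the pattern carries comments.

```perl
use strict;
use warnings;

my $line = 'X<atan2> X<arctangent> X<tan> X<tangent>';

$line =~ s/
    \w<        # a single word character followed by <
    ([^>]+)    # capture one or more characters that are not >
    >          # the closing >
/\U$1/gx;      # replacement stays on one line: x does not apply to it

print "$line\n";   # prints: ATAN2 ARCTANGENT TAN TANGENT
```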

Upvotes: 4
