Lane

Reputation: 695

Why is $_ =~ "regular expression" valid in Perl?

I know that in Perl the most common way to write a regular expression match is:

$_ =~ m/regular expression/;

# and "m" can be omitted
$_ =~ /regular expression/;

And I can use qr to create a regular expression reference like this:

my $regex = qr/regular expression/;
$_ =~ m/$regex/;

# and "m//" can be omitted:
$_ =~ $regex;

But I have tried this:

my $str = "regular expression";
$_ =~ $str; # why this is valid?

It didn't give me any error information and worked fine. I don't know why; I think it should be written like this:

my $str = "regular expression";
$_ =~ m/$str/;

# or
my $str = "regular expression";
my $regex = qr/$str/;
$_ =~ $regex;

Can anyone explain why $_ =~ $str is valid in Perl?

Upvotes: 4

Views: 618

Answers (3)

zdim

Reputation: 66883

It says under "The basics" in perlre

Patterns that aren't already stored in some variable must be delimitted, at both ends, by delimitter characters.

(The double-t spelling in "delimitted" and "delimitter" is the documentation's own.) There is in fact a bit more liberty than this seems to allow, as patterns in a string don't need delimiters either; see below.

Thus, a pattern in a variable just doesn't need delimiters. The operator =~ discussed in "Binding operators" in perlop

binds a scalar expression to a pattern match.

and (with my emphasis)

If the right argument is an expression rather than a search pattern, substitution, or transliteration, it is interpreted as a search pattern at run time.

The operator doesn't require delimiters on its right-hand side, and a "regex pattern" can be formed at run time out of an expression.

The section "Gory details of parsing quoted constructs" in perlop helps with this as well, apart from being illuminating in its own right. After the quoted construct is identified and the contained text interpolated it comes to the bullet "parsing regular expressions"

After preprocessing described above ... the resulting string is passed to the RE engine for compilation.

(my emphasis)

This is a general discussion of how Perl handles quoted constructs, and there is no requirement for (extra) delimiters once the string is formed out of the quoted construct. The m/RE/ forms (etc.) are discussed earlier, in the "interpolation" bullet, which shows some of the things that can't be used with a plain string for a pattern; but the delimited form is clearly not compulsory.

An example

"hello" =~ ( '(' . join('|', qw(a b l)) . ')' );  # now $1 has "l"

The outer parentheses on the RHS are there for precedence only, to isolate the expression yielding a pattern. One can use a do block as well: $s =~ do { ... };

I'd recommend against this though; use qr, as you expect. For one thing, using a string (and not a regex built with qr) is limiting. Also, it is more prone to silly errors.


Note that while for many patterns one can use either qr or "" (or its operator form qq()) to prepare the pattern (or the string which will be interpreted that way), they are not the same. Their quoting rules are quite similar, but qr prepares a regular expression which, as put in "Regexp Quote-Like Operators" in perlop,

... magically differs from a string containing the same characters ...

For one, recall that with qr you may use modifiers.
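
To illustrate that last point, here is a minimal sketch (the variable names and sample text are made up for this example). It contrasts a qr object, which carries its /i modifier with it, with a plain string, which can only get the same effect through an inline (?i) written into the pattern text.

```perl
use strict;
use warnings;

my $text = "Hello World";

# qr compiles the pattern and remembers the /i modifier.
my $re = qr/hello/i;

# A plain string has no place for trailing modifiers ...
my $str = "hello";

# ... so case-insensitivity has to go inside the pattern text itself.
my $str_i = "(?i)hello";

print "qr//i matches\n"        if $text =~ $re;
print "plain string matches\n" if $text =~ $str;    # not printed: case differs
print "inline (?i) matches\n"  if $text =~ $str_i;

# A qr object is a reference blessed into the Regexp class;
# a plain string is not a reference at all.
print "ref of qr: ", ref($re), "\n";           # Regexp
print "ref of string: '", ref($str), "'\n";    # empty string
```

Only the first and third match lines are printed, since the plain string "hello" is matched case-sensitively against "Hello World".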

Upvotes: 8

ikegami

Reputation: 385809

This is answered by the documentation for =~ in perlop:

If the right argument is an expression rather than a search pattern, substitution, or transliteration, it is interpreted as a search pattern at run time.


There are only a few things that can legitimately follow =~:

  • A match operator (m//)
  • A substitution operator (s///)
  • A transliteration operator (tr///)

Now, Perl could give a syntax error, as you expect, if anything else is found on the right-hand side of =~. But it does something far more useful instead: if it finds something other than the above operators, the result of the expression is used as the pattern for an implicit match operator.

This conveniently allows

$s =~ get_pattern()               # do { my $pat = get_pattern(); $s =~ /$pat/ }

and

$s =~ ( $sub_pat1 . $sub_pat2 )   # do { my $pat = $sub_pat1 . $sub_pat2; $s =~ /$pat/ }
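
A runnable sketch of both forms follows. The body of get_pattern() and the two sub-patterns are hypothetical, standing in for whatever produces the pattern in real code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the pattern as an ordinary string.
sub get_pattern { return '^\d+$' }

my $s = "12345";

# The return value of the call is used as the pattern at run time.
print "all digits\n" if $s =~ get_pattern();

# Same idea with a concatenation. The parentheses are needed because
# =~ binds more tightly than the . operator.
my $sub_pat1 = '^\d{2}';
my $sub_pat2 = '\d{3}$';
print "two digits then three digits\n" if $s =~ ( $sub_pat1 . $sub_pat2 );
```

Both lines are printed for this input. The patterns are single-quoted here so that the backslashes reach the regex engine unchanged.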

Upvotes: 2

Rafael

Reputation: 7746

Perl strives to be like a natural language; as such, the customary forms '' and "" may have different generic forms depending on the context. Here is Table 2-7, "Quote constructs", taken straight from Programming Perl, 4th Edition (p. 71):

+-----------+---------+-----------------------+--------------+
| Customary | Generic | Meaning               | Interpolates |
+-----------+---------+-----------------------+--------------+
| ''        | q//     | Literal string        | No           |
+-----------+---------+-----------------------+--------------+
| ""        | qq//    | Literal string        | Yes          |
+-----------+---------+-----------------------+--------------+
| ``        | qx//    | Command execution     | Yes          |
+-----------+---------+-----------------------+--------------+
| ()        | qw//    | Word list             | No           |
+-----------+---------+-----------------------+--------------+
| //        | m//     | Pattern match         | Yes          |
+-----------+---------+-----------------------+--------------+
| s///      | s///    | Pattern substitution  | Yes          |
+-----------+---------+-----------------------+--------------+
| tr///     | y///    | Character translation | No           |
+-----------+---------+-----------------------+--------------+
| ""        | qr//    | Regular expression    | Yes          |
+-----------+---------+-----------------------+--------------+

Example:

The string is converted to a pattern in this example. You have to take care here, though: when you construct patterns from double-quoted strings, you must escape the backslash.

You can clearly see here:

my $pat = "hello\\s+world"; # double backslash to escape the backslash

if ("hello       world" =~ $pat) {
    print "hello, world\n";
}

output:

hello, world
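
As a small sketch of the alternatives (variable names are illustrative only): a single-quoted string needs no doubling, and qr lets you write the pattern exactly as you would inside m//.

```perl
use strict;
use warnings;

my $target = "hello       world";

my $dq = "hello\\s+world";    # double-quoted: \\ becomes one backslash
my $sq = 'hello\s+world';     # single-quoted: the backslash is kept as written
my $re = qr/hello\s+world/;   # qr: written exactly as in m//

# Both strings end up holding the same pattern text.
print "same pattern text\n" if $dq eq $sq;

# Each of the three works on the right-hand side of =~.
for my $pat ($dq, $sq, $re) {
    print "matched\n" if $target =~ $pat;
}
```

This prints "same pattern text" once and "matched" three times. Writing "hello\s+world" with a single backslash in double quotes would not work the same way, because \s is not a recognized string escape and the backslash is lost before the regex engine sees it.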

Upvotes: 3
