Alexander Gelbukh

Reputation: 2230

Regular expression for start and end of string in multiline mode

In a regular expression in multiline mode, ^ and $ match at the start and end of each line. How can I match the end of the whole string?

In the string

Hello\nMary\nSmith\nHello\nJim\nDow

the expression

/^Hello(?:$).+?(?:$).+?$/ms

matches Hello\nMary\nSmith.

I wonder whether there is a metacharacter (like \ENDSTRING) that matches the end of the whole string, not just of a line, such that

/^Hello(?:$).+?(?:$).+?\ENDSTRING/ms

would match Hello\nJim\nDow. Similarly, is there a metacharacter that matches the start of the whole string, not just of a line?

Upvotes: 8

Views: 11151

Answers (2)

zdim

Reputation: 66873

There are indeed assertions (see perlre) for that:

\A Match only at beginning of string
\Z Match only at end of string, or before newline at the end

...
The \A and \Z are just like ^ and $, except that they won't match multiple times when the /m modifier is used, while ^ and $ will match at every internal line boundary. To match the actual end of the string and not ignore an optional trailing newline, use \z.

Also see Assertions in perlbackslash.

I am not sure what you're after in the shown example so here is another one

perl -wE'$_ = qq(one\ntwo\nthree); say for /(\w+\n\w+)\Z/m'

prints

two
three

while with $ instead of \Z it prints

one
two

Note that the above example would match qq(one\ntwo\nthree\n) as well (with a trailing newline), which may or may not be suitable. Please compare \Z and \z from the above quote for your actual needs. Thanks to ikegami for a comment.
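
To make the \Z versus \z difference concrete, here is a minimal sketch that runs the same pattern as the one-liner with each anchor, on the string with and without a trailing newline. The helper name first_capture is made up for this illustration; it is not part of Perl or of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture group, or undef on no match.
sub first_capture {
    my ($str, $re) = @_;
    return $str =~ $re ? $1 : undef;
}

# Same pattern as in the one-liner, anchored once with \Z and once with \z.
my @anchors = (
    [ '\Z', qr/(\w+\n\w+)\Z/m ],
    [ '\z', qr/(\w+\n\w+)\z/m ],
);

for my $str ("one\ntwo\nthree", "one\ntwo\nthree\n") {
    (my $shown = $str) =~ s/\n/\\n/g;    # make newlines visible when printing
    for my $anchor (@anchors) {
        my ($label, $re) = @$anchor;
        my $got = first_capture($str, $re);
        if (defined $got) {
            $got =~ s/\n/\\n/g;
            print "$label on \"$shown\": matched \"$got\"\n";
        }
        else {
            print "$label on \"$shown\": no match\n";
        }
    }
}
```

With the trailing newline present, only the \Z variant should report a match (two\nthree); the \z variant finds nothing because the string does not end right after "three".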

Upvotes: 11

ikegami

Reputation: 385590

\A and \z always match the beginning and the end of the string, respectively.

       without /m              with /m

\A     Beginning of string     Beginning of string
^      \A                      \A|(?<=\n)(?!\z)

\z     End of string           End of string
\Z     \z|(?=\n\z)             \z|(?=\n\z)
$      \z|(?=\n\z)             \z|(?=\n)

Put differently,

┌─────────────────── `\A` and `^`
│     ┌───────────── `(?m:$)`
│     │ ┌─────────── `(?m:^)`
│     │ │     ┌───── `\Z` and `$`
│     │ │     │ ┌─── `\z`, `\Z` and `$`
│     │ │     │ │
F o o ␊ B a r ␊

Remember, all of these matches are zero-length.
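
The positions in the diagram can be checked with a short script. This is only a sketch: match_positions is a hypothetical helper written for this illustration, and the offsets in the comments are what I expect for the sample string "Foo\nBar\n" (offset 8 is the end of the string).

```perl
use strict;
use warnings;

# Hypothetical helper: list every offset where the zero-length pattern matches.
sub match_positions {
    my ($str, $re) = @_;
    my @offsets;
    while ($str =~ /$re/g) {
        push @offsets, $-[0];    # $-[0] is the start offset of the last match
    }
    return @offsets;
}

my $str = "Foo\nBar\n";    # offsets: F=0 o=1 o=2 \n=3 B=4 a=5 r=6 \n=7, end=8

my @cases = (
    [ '\A',     qr/\A/ ],    # 0
    [ '^',      qr/^/  ],    # 0
    [ '(?m:^)', qr/^/m ],    # 0,4
    [ '\z',     qr/\z/ ],    # 8
    [ '\Z',     qr/\Z/ ],    # 7,8
    [ '$',      qr/$/  ],    # 7,8
    [ '(?m:$)', qr/$/m ],    # 3,7,8
);

for my $case (@cases) {
    my ($label, $re) = @$case;
    printf "%-8s %s\n", $label, join(',', match_positions($str, $re));
}
```

Note that (?m:^) does not match at offset 8: per perlre, ^ under /m matches after a newline except when that newline is the last character of the string.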

Upvotes: 3
