user14756437

Reputation: 89

Is there a way to write a REGEX pattern over multiple lines?

I often end up with ultra-complex and long regexps. PCRE @ PHP.

For a long time, I've been looking for a way to do something like:

preg_match('#blablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablablablablabla...
blablabla#uis');

Instead of:

preg_match('#blablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablablablablabla...blablabla#uis');

If I insert actual line breaks, they become part of the regular expression, perhaps not as line breaks but as whitespace, unless I'm completely mistaken.

Is there some character I can put at the end of each line to say: "this is all supposed to be one line"?

Upvotes: 4

Views: 374

Answers (1)

Wiktor Stribiżew

Reputation: 626747

You can use a HEREDOC, which supports variable interpolation, or a NOWDOC, which does not, together with the x flag (modifier). Here is what the docs say about this modifier:

x (PCRE_EXTENDED)
If this modifier is set, whitespace data characters in the pattern are totally ignored except when escaped or inside a character class, and characters between an unescaped # outside a character class and the next newline character, inclusive, are also ignored. This is equivalent to Perl's /x modifier, and makes it possible to include commentary inside complicated patterns. Note, however, that this applies only to data characters. Whitespace characters may never appear within special character sequences in a pattern, for example within the sequence (?( which introduces a conditional subpattern.

// HEREDOC
$pattern_with_interpolation = <<<EOD
/
blablabla...  # comment here
blablabla     # comment here
/uisx
EOD;

// NOWDOC
$pattern_without_interpolation = <<<'EOD'
/blablabla... # comment here
blablabla     # comment here
/uisx
EOD;
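
To make the HEREDOC/NOWDOC difference concrete, here is a minimal sketch of interpolating a value into a HEREDOC pattern. The variable names and the sample unit are hypothetical, and preg_quote() is my addition rather than part of the snippets above. As far as I know, preg_quote() does not escape whitespace, so a value containing spaces would need extra handling under the x flag.

```php
<?php
// Hypothetical value that should be matched literally.
$unit = 'lb.';

// Escape regex metacharacters; the second argument also escapes the delimiter.
$quoted = preg_quote($unit, '/');

// HEREDOC: {$quoted} is interpolated, while \d and \s are passed through unchanged.
$pattern = <<<EOD
/
\d+        # one or more digits
\s         # one whitespace character
{$quoted}  # the interpolated, quoted unit
/ux
EOD;

if (preg_match($pattern, 'weighs 12 lb. net', $m)) {
    echo $m[0];
}
// => 12 lb.
```

Keep the delimiter character out of the comments: PHP looks for the closing delimiter before PCRE ever sees the comment, so a stray / in a comment would end the pattern early.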

Mind that with the x flag you need to escape every # and every whitespace character that you want to match literally (or put it inside a character class). Unescaped whitespace only serves as formatting, and an unescaped # starts a comment that runs to the end of the line, so neither matches the corresponding character.
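
A short sketch of the two ways to keep a literal # or space under the x flag, escaping it or wrapping it in a character class. The pattern and the sample strings are made up for illustration.

```php
<?php
// NOWDOC keeps backslashes and $ untouched, so the pattern reads as written.
$pattern = <<<'EOD'
/
^
[#]        # a literal hash, protected by a character class
\d+        # an issue number
[ ]        # a literal space, protected by a character class
\#         # another literal hash, escaped this time
\w+        # a tag name
$
/x
EOD;

var_dump(preg_match($pattern, '#42 #urgent'));  // int(1)
var_dump(preg_match($pattern, '#42#urgent'));   // int(0), the space is required
```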

Example

$pattern_without_interpolation = <<<'EOD'
/
\d+      # one or more digits
\        # a single space
\p{L}+   # one or more letters
\#       # a literal hash symbol
/ux
EOD;
if (preg_match($pattern_without_interpolation, '1 pound#', $m)) {
    echo $m[0];
}
// => 1 pound#

See the PHP demo.

Upvotes: 2
