autarq

Reputation: 167

Regex for a multiline string with PHP preg_match

I'm trying to build a pattern for a multiline string that must start with <?php (optionally preceded by whitespace) and must NOT end with ?> (optionally followed by whitespace).

My attempt was /^\s?<\?php.*[^>]\s?$/s, but it did not work. I also tried a negative lookahead, to no avail.

Any idea? Thanks in advance.

Upvotes: 2

Views: 3737

Answers (2)

Wiktor Stribiżew

Reputation: 627607

You can use

(?s)^\s*<\?php(?!.*\?>\s*$).*+$

Regex explanation:

  • (?s) - Enables singleline (DOTALL) mode for the whole pattern, so that . also matches newlines
  • ^ - Start of string
  • \s* - Optional whitespace, 0 or more repetitions
  • <\?php - Literal <?php
  • (?!.*\?>\s*$) - Negative look-ahead that fails the match if the string ends with ?> followed by optional whitespace
  • .*+$ - Matches any characters up to the end of the string, without backtracking.

The possessive quantifier (as in .*+) lets the engine consume the characters once, in one go, and never give any of them back to try other ways of matching.

Possessive quantifiers are a way to prevent the regex engine from trying all permutations. This is primarily useful for performance reasons.

And then we do not need to use explicit SKIP-FAIL verbs.
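
To see the pattern in action with preg_match, here is a minimal sketch. The "/" delimiters, the helper name matchesLookahead() and the sample strings are my own assumptions for illustration, not part of the original pattern.

```php
<?php
// Minimal sketch: the lookahead pattern from this answer, wrapped in "/" delimiters.
// matchesLookahead() is a hypothetical helper name, not a PHP built-in.
function matchesLookahead(string $code): bool
{
    // Single quotes keep the backslashes intact for the regex engine.
    $pattern = '/(?s)^\s*<\?php(?!.*\?>\s*$).*+$/';
    return preg_match($pattern, $code) === 1;
}

// Made-up sample inputs.
var_dump(matchesLookahead("<?php\necho 'hi';\n"));      // true: opened, never closed
var_dump(matchesLookahead("  \n<?php\necho 'hi';"));    // true: leading whitespace is allowed
var_dump(matchesLookahead("<?php\necho 'hi';\n?>"));    // false: closing tag at the end
var_dump(matchesLookahead("<?php\necho 'hi';\n?>\n ")); // false: closing tag plus whitespace
```

A closing tag in the middle of the string does not cause a rejection, since the lookahead only looks at what the string ends with.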

Upvotes: 3

anubhava

Reputation: 786359

In PHP, you can use this regex:

'/^\s*<\?php(?:.(?!\?>\s*$))*$/s'

  • ^\s*<\?php matches optional whitespace and a literal <?php at the start of the string.
  • (?:.(?!\?>\s*$))* matches 0 or more characters, each of which must not be followed by ?> plus optional whitespace at the end of the string (checked with a negative lookahead).
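
A short sketch of how this first pattern behaves with preg_match. The helper name matchesTempered() and the sample strings are hypothetical, picked only to illustrate the two bullets above.

```php
<?php
// Minimal sketch: the first pattern from this answer, used as-is.
// matchesTempered() is a hypothetical helper name, not a PHP built-in.
function matchesTempered(string $code): bool
{
    $pattern = '/^\s*<\?php(?:.(?!\?>\s*$))*$/s';
    return preg_match($pattern, $code) === 1;
}

// Made-up sample inputs.
var_dump(matchesTempered("<?php\necho 1;\n"));      // true: no closing tag
var_dump(matchesTempered(" <?php\necho 1;"));       // true: leading whitespace
var_dump(matchesTempered("<?php\necho 1;\n?>"));    // false: ends with the closing tag
var_dump(matchesTempered("<?php\necho 1;\n?>\n"));  // false: closing tag plus whitespace
var_dump(matchesTempered("<?php?>"));               // true: see the note below
```

One edge case to be aware of: the lookahead is only checked after a character has been consumed, so a closing tag that directly follows <?php with nothing in between is not rejected. If that matters for your input, putting the lookahead before the dot, as in (?:(?!\?>\s*$).)*, should cover it.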

Update: This PCRE regex should perform faster than the previous one:

'/^\s*<\?php(?>.*\?>\s*$(*SKIP)(*F)|.*+$)/s'

  • (*FAIL) behaves like a failing negative assertion and is a synonym for (?!)
  • (*SKIP) defines a point beyond which the regex engine is not allowed to backtrack when the subpattern fails later
  • (*SKIP)(*FAIL) together provide a nice way around the restriction that you cannot have a variable-length lookbehind in the regex above.
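
A minimal sketch of the (*SKIP)(*F) variant in use. The helper name matchesSkipFail() and the sample strings are hypothetical. The first alternative matches exactly the strings we want to reject and then forces a failure; only if it cannot match does the second alternative accept the rest of the string.

```php
<?php
// Minimal sketch: the (*SKIP)(*F) pattern from this answer, used as-is.
// matchesSkipFail() is a hypothetical helper name, not a PHP built-in.
function matchesSkipFail(string $code): bool
{
    $pattern = '/^\s*<\?php(?>.*\?>\s*$(*SKIP)(*F)|.*+$)/s';
    return preg_match($pattern, $code) === 1;
}

// Made-up sample inputs.
var_dump(matchesSkipFail("<?php\necho 1;\n"));            // true: no closing tag
var_dump(matchesSkipFail("\t<?php\necho 1;"));            // true: leading whitespace
var_dump(matchesSkipFail("<?php\necho 1;\n?>"));          // false: ends with the closing tag
var_dump(matchesSkipFail("<?php\necho 1;\n?>  \n"));      // false: closing tag plus whitespace
var_dump(matchesSkipFail("<?php echo 1; ?>\n<p>x</p>"));  // true: closing tag is not at the end
```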

Upvotes: 2
