chx

Reputation: 11790

PCRE "or" operator behavior?

Given the patterns 1) /foo|oo/ and 2) /oo|foo/, matched with PCRE against the string "foo", which result should I expect?

  1. 1) foo 2) oo. PCRE keeps the "or" order.
  2. foo in both cases. PCRE tries all alternatives and picks the longest match.
  3. There is no preset rule; the optimizer might reorder the alternatives as it sees fit. It is the developer's duty to avoid ambiguous patterns like this.
  4. There is a rule, but it is not the one in 2.

"Try it and see" seems to rule out option 1, but trial and error alone cannot distinguish between options 2, 3, and 4.

Upvotes: 2

Views: 3888

Answers (1)

Amadan

Reputation: 198476

Option 4. The match that starts closest to the beginning of the string wins. When more than one alternative can match at that position, the alternative listed first in the pattern wins, even if a later one would match more text.

e.g.

Matching against banana (the matched text is shown in uppercase):

  1. /na/ matches baNAna, which starts sooner than banaNA.
  2. /an|b/ matches Banana, which starts sooner than bANana.
  3. /ba|./ matches BAnana. Both alternatives match at the same position, and ba is tried before the dot.
  4. /.|ba/ matches Banana. Both alternatives match at the same position, and the dot is tried before ba.
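
If you want to check these cases yourself, here is a minimal sketch. It uses Python's `re` module rather than PCRE itself, on the assumption that a quick script is easier to run; `re` is a different engine, but for simple patterns like these it follows the same leftmost, first-listed-alternative rule. The helper `first_match` is a name I made up for illustration. The extra pattern /fo|foo/ is not from the question; I added it because it is the case that separates "first alternative wins" from "longest match wins".

```python
import re

def first_match(pattern, text):
    """Return (start_index, matched_text) for the first match, or None."""
    m = re.search(pattern, text)
    return (m.start(), m.group()) if m else None

# (pattern, subject) pairs; the comment shows what the leftmost,
# first-alternative rule predicts.
cases = [
    ("foo|oo", "foo"),    # (0, 'foo')
    ("oo|foo", "foo"),    # (0, 'foo'): oo cannot match at index 0, foo can
    ("fo|foo", "foo"),    # (0, 'fo'): first alternative wins, not the longest
    ("na", "banana"),     # (2, 'na')
    ("an|b", "banana"),   # (0, 'b')
    ("ba|.", "banana"),   # (0, 'ba')
    (".|ba", "banana"),   # (0, 'b')
]

for pattern, text in cases:
    print(f"/{pattern}/ on {text!r}: {first_match(pattern, text)}")
```

The /fo|foo/ line is the interesting one: a longest-match engine (POSIX-style leftmost-longest) would return foo there, while an ordered-alternation engine returns fo.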

Upvotes: 2
