Mark Galeck
Mark Galeck

Reputation: 6385

why is backtracking involved in this Perl regex match?

Yesterday I asked a question latest Perl won't match certain regexes more than 32768 characters long to explain why

>./perl -e 'print (("a".("f"x32768)."a") =~ /a(?:[^a]|bb)*a/)'

does not work as expected (does not match) and the answer was, because of a backtracking depth limitation/bug of Perl regexes.

Why does the above match involve backtracking? The first a is matched, then, all the fs are matched by [^a]*, no need for the alternative bb, and then the last a is not [^a], neither bb, then the engine stops matching ([^a]|bb)* and then matches the last a and done. No backtracking needed.

The bug, occurs, when there is too much backtracking. OK, so the whole thing should work like this. The regex engine starts to match, proceeds to the end without backtracking, finishes with a match, no backtracking needed, the bug is not involved. Why doesn't it work that way?

Upvotes: 1

Views: 318

Answers (1)

ikegami
ikegami

Reputation: 385754

There's no backtracking for affffffa, but that doesn't mean the pattern can't backtrack. It'll require backtracking to find that affffff doesn't match. As such, the pattern needs to be ready to backtrack.

Upvotes: 2

Related Questions