mrk

Reputation: 3191

Handling meta-characters in regular expressions

In C, when you escape a character other than one of the built-in special characters, its ASCII code remains unchanged: \+ is the same as +. I'm writing a regular expression engine and wonder how one could distinguish \+ from +, for example.

Upvotes: 0

Views: 143

Answers (1)

ruakh

Reputation: 183371

The usual solution is that the regex engine itself expects to see \+ (a backslash followed by a plus sign) for a literal plus, so if the regex comes from a string literal, the programmer has to write \\+: the compiler turns \\ into a single backslash, and the engine then receives \+. Oddly enough, this approach is used even in some languages that have built-in or standard regex support and so could have offered dedicated regex syntax.
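As a rough illustration of the engine's side of this, here is a minimal sketch of a tokenizer that treats a backslash as "take the next character literally". The names tok_kind, token and next_token are made up for this example, and only the + operator is handled; a real engine would need many more token kinds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch; a real engine has more. */
enum tok_kind { TOK_LITERAL, TOK_PLUS, TOK_END };

struct token {
    enum tok_kind kind;
    char ch;    /* the literal character; meaningful only for TOK_LITERAL */
};

/*
 * Read one token from pattern starting at *pos and advance *pos.
 * A backslash makes the following character a literal; a bare '+'
 * is the "one or more" operator.
 */
struct token next_token(const char *pattern, size_t *pos)
{
    struct token t = { TOK_END, '\0' };
    char c;

    assert(pattern != NULL && pos != NULL);

    c = pattern[*pos];
    if (c == '\0')
        return t;                   /* end of pattern */
    (*pos)++;

    if (c == '\\' && pattern[*pos] != '\0') {
        t.kind = TOK_LITERAL;       /* escaped: next char is literal */
        t.ch = pattern[(*pos)++];
    } else if (c == '+') {
        t.kind = TOK_PLUS;          /* unescaped metacharacter */
    } else {
        t.kind = TOK_LITERAL;       /* ordinary character */
        t.ch = c;                   /* (a trailing backslash lands here; */
    }                               /*  that is a choice of this sketch) */
    return t;
}
```

With this, the C string literal "a\\++" (four characters at run time: a, \, +, +) tokenizes as a literal a, a literal +, and then the + operator.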

In theory, an alternative is for the engine to use a different escape character, one that doesn't conflict with the escape character of string literals (say, + for "one or more" and '+ for "an actual plus sign"), but this approach seems to be far less popular, for some reason.
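To make the alternative concrete, here is a small sketch in which the escape character is a parameter of the engine rather than hard-wired to backslash. The function name classify_tokens and the 'L'/'Q' markers are invented for this illustration and are not taken from any real library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write one marker per token into out: 'L' for a literal character,
 * 'Q' for the "one or more" quantifier. esc is the engine's escape
 * character (an assumption of this sketch, e.g. '\'' rather than '\\').
 * out must have room for strlen(pattern) + 1 bytes.
 */
void classify_tokens(const char *pattern, char esc, char *out)
{
    size_t i = 0, n = 0;

    assert(pattern != NULL && out != NULL);

    while (pattern[i] != '\0') {
        if (pattern[i] == esc && pattern[i + 1] != '\0') {
            out[n++] = 'L';     /* escaped character is a literal */
            i += 2;
        } else if (pattern[i] == '+') {
            out[n++] = 'Q';     /* bare '+' is the quantifier */
            i += 1;
        } else {
            out[n++] = 'L';     /* ordinary character */
            i += 1;
        }
    }
    out[n] = '\0';
}
```

With ' as the escape character, the pattern can be written in C source exactly as the engine sees it: classify_tokens("a'++", '\'', out) and classify_tokens("a\\++", '\\', out) both produce "LLQ", but only the second needs the doubled backslash in the literal.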

Upvotes: 1
