J. Wend

Reputation: 13

Regex to match all lines not starting with ==

This should be simple, but I've been having trouble with it. I want to write a regex in Perl to match all lines that do not begin with "==". I created this expression:

^[^\=\=].*

This works fine in a regex tester I use, but when I run the Perl script I get an error stating:

POSIX syntax [= =] is reserved for future extensions in regex

And the script terminates. I assume I'm using some syntax wrong, but I haven't found anything regarding this. Does anyone have a better way to match these lines?

Upvotes: 0

Views: 1959

Answers (2)

Borodin

Reputation: 126722

You're misunderstanding how character classes work in regular expressions

A character class is delimited by square brackets [...] and generally will match any one of the characters that it encloses. So [abc] will match a, b, or c, but only the first character of aa or cbc. You probably know that you can also use ranges, such as [a-c]

You can also negate the class, as you have done, so [^a] will match any one character that isn't an a, such as z or &, but only the first character of zz

Replicating a character in a class will not change what it matches, so [aardvark] will match exactly one of a, d, k, r, or v, and is equivalent to [adkrv]

Your regex pattern uses the character class [^\=\=]. It's unnecessary to escape an equals sign, and replicating it has no effect, so you have the equivalent of [^=], which will match any single character other than the equals sign =
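To see what the reduced class does in practice, here is a minimal sketch that runs ^[^=].* against a few sample lines. The sample lines and the helper name matches_class are made up for illustration and are not from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the pattern from the question, with the
# character class reduced to its equivalent [^=].
sub matches_class {
    my ($line) = @_;
    # [^=] must consume exactly one character that is not "=".
    return $line =~ /^[^=].*/ ? 1 : 0;
}

# Hypothetical sample lines, chosen only for illustration.
for my $line ( '== heading', '=A', 'plain text', '' ) {
    printf "%-14s %s\n", "'$line'", matches_class($line) ? 'match' : 'no match';
}
```

Only 'plain text' matches. Both '=A' and the empty line are rejected even though neither starts with ==, because the class looks at a single character and has to consume one.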

The reason you got that error message is that character classes beginning [= and ending =] (just [=] doesn't count) are reserved for special behaviour yet to be implemented. As above, there would ordinarily be no reason to write a character class with multiple occurrences of the same character, so it's reasonable to disallow such a construction

perldoc perldiag has this to say

POSIX syntax [= =] is reserved for future extensions in regex; marked by <-- HERE in m/%s/

(F) Within regular expression character classes ([]) the syntax beginning with "[=" and ending with "=]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[=" and "=\]". The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

A solution depends on how you want to use the test in your Perl code, but if you need an if statement then I would simply invert the test and check that the line doesn't start with ==

unless ( /^==/ )

or, if you're allergic to Perl's unless

if ( not /^==/ )
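If you want to filter a whole list of lines rather than test one at a time, the same inverted test works inside grep. This is only a sketch: the helper name lines_without_marker and the sample data are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines that do not begin with "==".
sub lines_without_marker {
    my @input = @_;
    # Inverted test: keep a line unless it starts with two equals signs.
    return grep { !/^==/ } @input;
}

# Hypothetical sample lines, chosen only for illustration.
my @kept = lines_without_marker( '== heading', '=A', 'plain text', '' );
print "kept: '$_'\n" for @kept;
```

This keeps '=A', 'plain text' and the empty line, and drops only '== heading'.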

Upvotes: 2

Mauro Lacy

Reputation: 389

Your regex is incorrect, as it fails with =A as input, for example. A way to do it would be with a negative lookahead in a Perl Compatible Regular Expression (PCRE): ^(?!==)
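Perl's own regex engine supports this lookahead, so the pattern can be used directly in a script. A minimal sketch follows; the helper name not_double_equals and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the lookahead pattern.
sub not_double_equals {
    my ($line) = @_;
    # (?!==) is a zero-width assertion: it succeeds at the start of the
    # line only when the next two characters are not "==".
    return $line =~ /^(?!==)/ ? 1 : 0;
}

# Hypothetical sample lines, chosen only for illustration.
for my $line ( '== heading', '=A', 'plain text', '' ) {
    printf "%-14s %s\n", "'$line'", not_double_equals($line) ? 'match' : 'no match';
}
```

Here '=A', 'plain text' and the empty line all match, and only '== heading' is rejected. The lookahead consumes nothing, so append .* to the pattern if you need the match to cover the whole line.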

Upvotes: 3
