Nathan Edwards

Reputation: 13

Using RegEx to match the beginning of a string if the end of the string is not

I am trying to match lines in a configuration that start with the word "deny" but do not end with the word "log". This seems terribly elementary, but I cannot find a solution in any of the numerous forums I have looked at. My beginner's mindset led me to try "^deny.* (?!log$)". Why wouldn't this work? My understanding is that it would match any string that begins with "deny", followed by zero or more of any character, where the end of the line is something other than "log".

Upvotes: 1

Views: 584

Answers (2)

MikeM

Reputation: 13631

(?!log$) is a zero-width negative look-ahead assertion. It means "don't match if log followed by the end of the string lies immediately ahead of this point in the string". However, the .* in your regex has already greedily consumed all the characters right up to the end of the string, so by the time the look-ahead is checked there is no log left ahead of it to reject.

If your regular expression implementation supports look-behinds, you could use a regex such as the one in Josh Kelley's answer. If you are using JavaScript, you could use

/^deny(?:.{0,2}|.*(?!log)...)$/m

The m flag means multiline mode, which makes ^ and $ match the start and end of every line rather than just the start and end of the string.

Note that three . are positioned after the negative look-ahead so that it has space to match log if it is there. Including these three dots means it is also necessary to add the .{0,2} alternative, so that lines with zero to two characters after deny also match. (?:a|b) is a non-capturing group in which either a or b has to match.
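As a rough check of the pattern above, the sketch below runs it over a few lines of made-up configuration text. The function name findDenyWithoutLog and the sample lines are assumptions for illustration only, and the g flag is added here (it is not part of the pattern above) so that every matching line is collected rather than just the first.

```javascript
// Collects every line that starts with "deny" and does not end with "log".
// The g flag is an addition for this sketch; the m flag makes ^ and $
// match at the start and end of each line.
function findDenyWithoutLog(text) {
  return text.match(/^deny(?:.{0,2}|.*(?!log)...)$/gm) || [];
}

// Hypothetical configuration text, invented for this example.
const config = [
  "deny ip any any log",
  "deny ip any any",
  "permit ip any any",
  "deny",
  "deny x",
].join("\n");

console.log(findDenyWithoutLog(config));
```

The short lines "deny" and "deny x" are picked up by the .{0,2} alternative, while the longer line is picked up by the branch with the three dots.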

Upvotes: 1

Josh Kelley

Reputation: 58362

When given a line like deny this log, your ^deny.*(?!log$) regex (I'm omitting the space that was in your sample question) is evaluated as follows:

  • ^deny matches "deny".
  • .* means "match 0 or more of any character", so it can match " this log".
  • (?!log$) means "make sure that the next characters aren't 'log' then the end of the line." In this case, they're not - they're just the end of the line - so the regex matches.
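The steps above can be reproduced with a small sketch, written here in JavaScript as an assumption about the environment. The function name matchOriginal and the sample lines are made up for illustration; the pattern is the one from the question without the space.

```javascript
// Returns the text matched by the question's pattern (without the space),
// or null when the pattern does not match at all.
function matchOriginal(line) {
  const m = line.match(/^deny.*(?!log$)/);
  return m === null ? null : m[0];
}

// Hypothetical sample lines, not taken from a real configuration.
// .* consumes the rest of the line, so the look-ahead is only checked at
// the very end, where nothing (and therefore not "log") lies ahead.
console.log(matchOriginal("deny this log")); // the whole line is matched
console.log(matchOriginal("permit this"));   // null
```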

Try this regex instead:

^deny.*$(?<!log)

"Match deny at the beginning of the string, then match to the end of the line, then use a zero-width negative look-behind assertion to check that whatever we just matched at the end of the line is not 'log'."

With all of that said...

Regexes aren't necessarily the best tool for the job. In this case, a simple Boolean operator like

if (/^deny/ and not /log$/)

is probably clearer than a more advanced regex like

if (/^deny.*$(?<!log)/)
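For comparison, here is a minimal sketch of both approaches in JavaScript, assuming an engine with look-behind support (ES2018 or later). The function names singleRegex and twoTests and the sample lines are hypothetical, and each line is tested on its own, so no m flag is needed.

```javascript
// Single-regex version using a negative look-behind after the end anchor.
function singleRegex(line) {
  return /^deny.*$(?<!log)/.test(line);
}

// Two simple tests combined with Boolean operators.
function twoTests(line) {
  return /^deny/.test(line) && !/log$/.test(line);
}

// Hypothetical sample lines, invented for this example.
const samples = [
  "deny ip any any log",
  "deny ip any any",
  "permit ip any any",
  "deny",
];

for (const line of samples) {
  console.log(line, singleRegex(line), twoTests(line));
}
```

On single lines like these the two functions agree; the second version is easier to read and does not depend on look-behind support.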

Upvotes: 2
