Paolo
Paolo

Reputation: 2472

Negation of a regular expression

I am not sure how it is called: negation, complementary or inversion. The concept is this. For example having alphabet "ab"

R = 'a'
!R = the regexp that matche everyhting exept what R matches

In this simple example it should be soemthing like

!R = 'b*|[ab][ab]+'

How is such a regexp called? I remeber from my studies that there is a way to calculate that, but it is something complicated and generally too hard to make by hand. Is there a nice online tool (or regular software) to do that?

Upvotes: 3

Views: 1351

Answers (2)

The Beruriah Incident
The Beruriah Incident

Reputation: 3257

jbo5112's answer gives good practical help. However, on the theoretical side: a regular expression corresponds to a regular language, so the term you're looking for is complementation.

To complement a regex:

  1. Convert into the equivalent NFA. This is a well-known and defined process.
  2. Convert the NFA to a DFA via the powerset construction
  3. Complement the DFA by making accept states not accept and vice versa.
  4. Convert the DFA to a regular expression.

You now have the complement of the original regular expression!

Upvotes: 4

jbo5112
jbo5112

Reputation: 833

If all you're doing is searching, then some software/languages for regular expressions have a way to negate the match built in. For example, with grep you can use a '-v' option to get lines that don't match and the SQL variants I've seen allow you to use a 'not' qualifier to negate the match.

Another option that some/most/all regex dialects support is to use "negative lookahead". You may have to look up your specific syntax, but it's an interesting tool that is well worth reading about. Generally it's something like this: if R='<regex>', then Negative_of_R='(?!<regex>)'. Unfortunately, it can vary with the peculiarities of your language (e.g. vim uses \(<regex>\)\@!).

A word of caution: If you're not careful, a negated regular expression will match more than you expect. If you have the text This doesn't match 'mystring'. and search for (?!mystring), then it will match everything except the 'm' in mystring.

Upvotes: 3

Related Questions