Reputation: 10128
Is there a regular expression that matches valid regular expressions?
(I know there are several flavors of regexps. One would do.)
Upvotes: 33
Views: 5910
Reputation: 22009
If you merely want to check whether a regular expression is valid or not, simply try to compile it with whichever programming language or regular expression library you're working with.
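As a minimal sketch of the "just try to compile it" approach, here is how it might look in Python with the standard `re` module. The helper name `is_valid_regex` is made up for illustration, and the result only says whether the pattern is valid in Python's flavor, not in any other.

```python
import re

def is_valid_regex(pattern: str) -> bool:
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        # re.error is raised for syntax problems such as an unclosed group.
        return False
    return True

if __name__ == "__main__":
    for candidate in [r"\w+(\d)?", "(", "[a-z"]:
        print(repr(candidate), is_valid_regex(candidate))
```

Other languages expose the same check through their own compile call and exception type.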
Parsing regular expressions is far from trivial. As the author of RegexBuddy, I have been around that block a few times. If you really want to do it, use a regex to tokenize the input, and leave the parsing logic to procedural code. That is, your regex would match one regex token (^, $, \w, (, ), etc.) at a time, and your procedural code would check if they're in the right order.
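The tokenize-then-check idea can be sketched roughly as follows. This is only an illustration for a deliberately small, hypothetical regex flavor (escapes, character classes, groups, alternation, anchors, and basic quantifiers); the token names, the `check` function, and the ordering rules are assumptions made for the example, not how RegexBuddy or any real engine does it.

```python
import re

# One alternative per token kind; each match consumes exactly one token.
TOKEN = re.compile(r"""
    (?P<ESCAPE>\\.)
  | (?P<CLASS>\[\^?\]?(?:\\.|[^\]\\])*\])
  | (?P<QUANT>(?:[*+?]|\{\d+(?:,\d*)?\})\??)
  | (?P<OPEN>\((?:\?:)?)
  | (?P<CLOSE>\))
  | (?P<ALT>\|)
  | (?P<ANCHOR>[\^$])
  | (?P<LITERAL>[^\\\[])
""", re.VERBOSE | re.DOTALL)

# Token kinds that a quantifier may not follow in this toy flavor.
NOT_QUANTIFIABLE = {None, "OPEN", "ALT", "QUANT", "ANCHOR"}

def check(pattern: str) -> bool:
    """Tokenize with a regex, then validate token order procedurally."""
    depth = 0      # current group nesting depth
    prev = None    # kind of the previous token
    pos = 0
    while pos < len(pattern):
        m = TOKEN.match(pattern, pos)
        if m is None:
            # No token fits: a trailing backslash or an unterminated '['.
            return False
        kind = m.lastgroup
        if kind == "OPEN":
            depth += 1
        elif kind == "CLOSE":
            depth -= 1
            if depth < 0:
                return False  # ')' without a matching '('
        elif kind == "QUANT" and prev in NOT_QUANTIFIABLE:
            return False      # nothing to repeat
        prev = kind
        pos = m.end()
    return depth == 0         # every '(' must have been closed
```

The regex never has to understand nesting; the `depth` counter in the procedural part does that, which is the division of labor described above.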
Upvotes: 38
Reputation: 2567
From Douglas Crockford's The JavaScript Programming Language video 4 (of 4):
/\/(\\[^\x00-\x1f]|\[(\\[^\x00-\x1f]|[^\x00-\x1f\\\/])*\]|[^\x00-\x1f\\\/\[])+\/[gim]*/
http://video.yahoo.com/watch/111596/1710658 at approximately -17.20.
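To get a feel for what that pattern accepts, here is a small experiment in Python. It assumes the pattern can be carried over unchanged apart from dropping the outer JavaScript `/.../` delimiters; the names `JS_REGEX_LITERAL` and `compiles` are invented for this sketch. It suggests the pattern recognizes the lexical shape of a JavaScript regex literal rather than checking that the body is a well-formed regex.

```python
import re

# The pattern from the answer, minus the outer JavaScript delimiters.
JS_REGEX_LITERAL = re.compile(
    r"\/(\\[^\x00-\x1f]|\[(\\[^\x00-\x1f]|[^\x00-\x1f\\\/])*\]"
    r"|[^\x00-\x1f\\\/\[])+\/[gim]*"
)

def compiles(body: str) -> bool:
    """Check the body against Python's own regex compiler for comparison."""
    try:
        re.compile(body)
    except re.error:
        return False
    return True

if __name__ == "__main__":
    # A plain literal with a flag is accepted.
    print(bool(JS_REGEX_LITERAL.fullmatch("/a+b/g")))
    # "/(/" has the shape of a literal, yet its body "(" is an unclosed group.
    print(bool(JS_REGEX_LITERAL.fullmatch("/(/")), compiles("("))
```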
Upvotes: 4
Reputation: 49218
Is there a regular expression that matches valid regular expressions?
By definition, it's quite simple: No.
The language of all regexes is not a regular language (just look at nested parentheses), and therefore there can't be a regular expression to parse it.
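One way to see the nesting problem concretely: a regex can be built for any fixed maximum depth of parentheses, but each extra level needs a longer pattern, so no single finite pattern covers every depth. The sketch below constructs such bounded-depth patterns in Python; the function name `balanced_up_to` is made up for the illustration.

```python
import re

def balanced_up_to(depth: int) -> "re.Pattern[str]":
    """Build a regex matching balanced parentheses nested at most `depth` deep."""
    # Depth 0: any text that contains no parentheses at all.
    pattern = r"[^()]*"
    for _ in range(depth):
        # One more level: plain text interleaved with parenthesized
        # groups whose contents match the previous level's pattern.
        pattern = rf"[^()]*(?:\({pattern}\)[^()]*)*"
    return re.compile(pattern)

if __name__ == "__main__":
    text = "a(b(c))"
    print(bool(balanced_up_to(1).fullmatch(text)))  # nesting is too deep
    print(bool(balanced_up_to(2).fullmatch(text)))  # deep enough
```

Whatever depth is chosen, an input nested one level deeper is rejected even though it is perfectly balanced.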
Upvotes: 42
Reputation: 25996
As already said, you cannot describe regular expressions with a regular expression due to their recursive nature. You'll need a context-free grammar for that.
But what would be the point of having such a regular expression, anyway? If you just want to check whether a regular expression is correct, you can simply try to compile it (Pattern.compile(regexp) in Java); if that throws a PatternSyntaxException, the expression is not valid.
Upvotes: 9
Reputation: 41152
You probably need a parser, not a regex. Regexes are powerful tools, but are not parsing tools. They are not well suited to nested grammars, for example.
Upvotes: 5
Reputation: 21620
Unfortunately, most invalid regular expressions are invalid due to parentheses nesting errors. This is exactly the kind of string that regular expressions can't match. (Okay, some fancy regular expression systems have recursion extensions, but that's rare.)
Upvotes: 15
Reputation: 41867
Depending on your goal, I would say definitely maybe.
If you want to filter regexps out from somewhere, it might prove difficult, as regular expressions come in all sizes and shapes and they don't all start and end with slashes.
If you just need to know whether or not a regexp is valid, there is another way. Depending on the language you're using, you could wrap the compilation of the regexp in a try/catch.
If you can be more specific, I could try to give a better answer; the question is intriguing.
Upvotes: -1