Reputation: 20604
I am writing a tool to help students learn regular expressions. I will probably be writing it in Java.
The idea is this: the student types in a regular expression and the tool shows which parts of a text will get matched by the regex. Simple enough.
But I want to support several different regex "flavors" such as:
Java has the java.util.regex package, but it supports only Perl-style regular expressions, which are a superset of the basic and extended REs. What I think I need is a way to take any given regular expression and escape the meta-characters that aren't part of the selected flavor. Then I could give it to a Pattern object and it would behave as if it were written for the selected RE interpreter.
For example, given the following regex:
^\w+[0-9]{5}-(\d{4})?$
As a basic regular expression, it would be interpreted as:
^\\w\+[0-9]\{5\}-\(\\d\{4\}\)\?$
As an extended regular expression, it would be:
^\\w+[0-9]{5}-(\\d{4})?$
And as a Perl-style regex, it would be the same as the original expression.
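To make the idea concrete, here is a minimal sketch of such an escaping pass, written as a character scanner rather than a single search-and-replace regex. The class name `FlavorEscaper`, the `Flavor` enum and the method `toJavaRegex` are hypothetical names made up for illustration. It assumes that a backslash followed by a letter is not special in the POSIX flavors, that bracket expressions can be copied through unchanged, and (going beyond the examples above) that `\(`, `\{`, `\+` and the like are the special forms in a basic RE, as in GNU tools.

```java
final class FlavorEscaper {
    enum Flavor { BASIC, EXTENDED, PERL }

    // Characters that are literal when unescaped in a basic RE but special in Java.
    private static final String BRE_LITERALS = "+?{}()|";

    // Rewrites a regex typed in the given flavor so that java.util.regex
    // treats it the way that flavor would (simplified, see assumptions above).
    static String toJavaRegex(String regex, Flavor flavor) {
        if (flavor == Flavor.PERL) {
            return regex; // Java's syntax is already Perl-style
        }
        StringBuilder out = new StringBuilder();
        boolean inBracket = false;
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (inBracket) {
                // Simplification: copy bracket expressions through unchanged.
                out.append(c);
                if (c == ']') inBracket = false;
            } else if (c == '[') {
                inBracket = true;
                out.append(c);
            } else if (c == '\\') {
                if (i + 1 == regex.length()) {
                    out.append("\\\\"); // trailing backslash becomes a literal one
                    continue;
                }
                char next = regex.charAt(++i);
                if (Character.isLetter(next)) {
                    // \w, \d, ... are not POSIX: make the backslash literal.
                    out.append("\\\\").append(next);
                } else if (flavor == Flavor.BASIC && BRE_LITERALS.indexOf(next) >= 0) {
                    // Assumption: \( \) \{ \} \+ \? \| are the special forms in BRE.
                    out.append(next);
                } else {
                    out.append('\\').append(next);
                }
            } else if (flavor == Flavor.BASIC && BRE_LITERALS.indexOf(c) >= 0) {
                // Unescaped + ? { } ( ) | are ordinary characters in BRE.
                out.append('\\').append(c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Passing the example regex above with `Flavor.BASIC` or `Flavor.EXTENDED` should reproduce the two escaped forms shown, and the result can then be handed to `Pattern.compile`. Bracket expressions such as `[]abc]` and POSIX classes like `[[:alpha:]]` would need extra handling that this sketch leaves out.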
Is there a "regular expression for regular expressions" that I could run through a regex search-and-replace to quote the non-meta characters? What else could I do? Are there alternative Java classes I could use?
Upvotes: 1
Views: 1190
Reputation: 103824
If your target is a Unix / Linux system, why not just shell out to the definitive host of each regex? That is, use grep for BRE, egrep for ERE, perl for PCRE, etc. The only thing your module would need to do is the UI. Most of the decent regex testers that I have seen use a variant of this approach.
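As a rough illustration of shelling out from Java, the sketch below builds a `grep` command line per flavor and collects the matched substrings. The class `GrepRunner` and its methods are hypothetical names. It assumes a `grep` on the PATH that supports `-o` (print only the matched parts), `-E` for ERE and `-P` for Perl-style patterns; `-o` and `-P` are GNU extensions rather than POSIX, and `-P` stands in here for calling perl directly.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class GrepRunner {
    enum Flavor { BASIC, EXTENDED, PERL }

    // Builds the argument list; the regex is passed as its own argument,
    // so no shell quoting is involved.
    static List<String> commandFor(Flavor flavor, String regex) {
        List<String> cmd = new ArrayList<>();
        cmd.add("grep");
        cmd.add("-o");                                 // only the matched text
        if (flavor == Flavor.EXTENDED) cmd.add("-E");  // same as egrep
        if (flavor == Flavor.PERL) cmd.add("-P");      // assumption: GNU grep
        cmd.add("-e");
        cmd.add(regex);
        return cmd;
    }

    // Feeds the text to grep on stdin and returns one entry per match.
    // Fine for short sample texts; large inputs would need the writing
    // and reading done concurrently to avoid blocking on a full pipe.
    static List<String> matches(Flavor flavor, String regex, String text)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(commandFor(flavor, regex)).start();
        try (OutputStream stdin = p.getOutputStream()) {
            stdin.write(text.getBytes(StandardCharsets.UTF_8));
        }
        String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        p.waitFor();
        List<String> result = new ArrayList<>();
        for (String line : out.split("\n")) {
            if (!line.isEmpty()) result.add(line);
        }
        return result;
    }
}
```

The UI would then only need to highlight the returned substrings in the sample text. Since `grep -o` reports the matched text and not its position, a real tool would probably also want `-b` (byte offsets) or a different driver.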
If you want yet another library suggestion, look at TRE for the BRE / ERE / POSIX / AWK part. It does not support back references, so PCRE / Python / Ruby / JS / Java is out...
Upvotes: 1
Reputation: 13867
If you want your students to learn regex, why not use a freely available tool such as The Regex Coach (http://www.weitz.de/regex-coach/)? It is pretty good for learning and evaluating regexes.
Also look at this SO thread on a similar issue: https://stackoverflow.com/questions/89718/is-there-anything-like-regexbuddy-in-the-open-source-world
Upvotes: 0
Reputation: 89171
I have written something similar: "Is there a regular expression to detect a valid regular expression?"
You could take part of that expression and match each token separately:
[^?+*{}()[\]\\] # literal characters
\\[A-Za-z] # Character classes
\\\d+ # Back references
\\\W # Escaped characters
\[\^?(?:\\.|[^\\])+?\] # Character class
\((?:\?[:=!>]|\?<[=!])? # Beginning of a group
\) # End of a group
(?:[?+*]|\{\d+(?:,\d*)?\})\?? # Repetition
\| # Alternation
For each match, you could have some dictionary of appropriate replacements in the target flavor.
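A minimal sketch of that tokenize-and-replace idea follows. The names `RegexTranslator`, `translate` and `TO_BASIC` are hypothetical. The token patterns are the ones listed above, merged into one alternation with named groups (the backslash cases are folded into a single `escape` token, `|` is excluded from the literal class so that alternation gets its own token, and `[` is escaped inside the class because Java would otherwise read it as a nested class). The sample dictionary rewrites Perl-style syntax into GNU-style basic RE syntax, which is just one possible target.

```java
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexTranslator {
    // Token kinds, in the order they are tried.
    private static final List<String> KINDS =
            List.of("escape", "cls", "open", "close", "rep", "alt", "lit");

    private static final Pattern TOKEN = Pattern.compile(
            "(?<escape>\\\\.)"                               // \w, \1, \. ...
            + "|(?<cls>\\[\\^?(?:\\\\.|[^\\\\])+?\\])"       // [...] character class
            + "|(?<open>\\((?:\\?[:=!>]|\\?<[=!])?)"         // beginning of a group
            + "|(?<close>\\))"                               // end of a group
            + "|(?<rep>(?:[?+*]|\\{\\d+(?:,\\d*)?\\})\\??)"  // repetition
            + "|(?<alt>\\|)"                                 // alternation
            + "|(?<lit>[^?+*{}()\\[\\]\\\\|])");             // literal character

    // Example dictionary: Perl-style tokens to GNU-style BRE syntax.
    // Lazy quantifiers and (?:...) groups have no BRE equivalent and are
    // not handled properly here.
    static final Map<String, UnaryOperator<String>> TO_BASIC = Map.of(
            "open", t -> "\\(",
            "close", t -> "\\)",
            "alt", t -> "\\|",
            "rep", t -> t.equals("*") ? t
                    : t.startsWith("{") ? "\\" + t.replace("}", "\\}")
                    : "\\" + t);

    // Splits the regex into tokens and applies the rule for each token kind;
    // kinds without a rule are copied through unchanged.
    static String translate(String regex, Map<String, UnaryOperator<String>> rules) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(regex);
        int pos = 0;
        while (pos < regex.length()) {
            m.region(pos, regex.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unrecognized token at index " + pos);
            }
            for (String kind : KINDS) {
                String token = m.group(kind);
                if (token != null) {
                    out.append(rules.getOrDefault(kind, UnaryOperator.identity()).apply(token));
                    break;
                }
            }
            pos = m.end();
        }
        return out.toString();
    }
}
```

For instance, `translate("^a+(b|c){2}$", TO_BASIC)` should give `^a\+\(b\|c\)\{2\}$`. Other target flavors would only need a different rule map.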
Upvotes: 1
Reputation: 29143
Check out this post for a "regular expression for regular expressions": "Is there a regular expression to detect a valid regular expression?"
You can use this as a basis for your module.
Upvotes: 1
Reputation: 50237
Alternatively, you could use Jakarta ORO?
This supports the following regex 'flavors':
Upvotes: 1