Icemanind

Reputation: 48686

RegEx standards across languages

I am asking this question because I have noticed some slight differences in RegEx syntax between different languages.

Is there a RegEx standard that is maintained somewhere? If so, where can I find this document? Also, if I write a regular expression in .NET, is the same expression guaranteed to be 100% compatible with other languages, such as Perl, JavaScript, or Java?

Finally, are there any "best practices" for using RegEx that help keep expressions maintainable across other platforms and languages?

Upvotes: 19

Views: 8226

Answers (3)

Anirudha

Reputation: 32797

Best Practices

Avoid lookbehinds (both positive and negative) and, in some cases, lookaheads, because support for them varies between flavors.
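To make the advice concrete, here is a minimal sketch in Python (chosen only for illustration; the sample text and variable names are made up). It shows one way a lookbehind can sometimes be replaced by a plain capture group, which most flavors understand. This is not always possible, for example when matches must not consume the preceding text.

```python
import re

# Hypothetical sample input, used only for illustration.
text = "cost $42 and $7"

# Lookbehind version: depends on (?<=...), which not every flavor supports.
with_lookbehind = re.findall(r"(?<=\$)[0-9]+", text)

# More portable version: match the dollar sign itself and capture the digits.
# With exactly one group, re.findall returns the group contents.
with_capture = re.findall(r"\$([0-9]+)", text)

print(with_lookbehind)
print(with_capture)
```

Both calls extract the same digits here; the second pattern uses only a literal, a bracket expression, a quantifier, and a group.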

Upvotes: 0

stema

Reputation: 92986

No, there is no such standard. Of course there are PCRE, POSIX BRE, POSIX ERE, ...

But in practice there will be "small" differences in any language. You can rely on very basic things in most flavours, like the . for any character or the quantifiers +*?, and character classes are also common. The differences already start at predefined classes like \w: is it supported at all? Is it ASCII based or Unicode based?

A helpful resource here is the flavor comparison on regular-expressions.info by Jan Goyvaerts.
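The \w question above can be seen even within a single language. The following is a small sketch using Python 3's re module (the sample string and variable names are assumptions for illustration): by default \w is Unicode aware for str patterns, the re.ASCII flag restricts it to ASCII, and an explicit bracket expression behaves the same way regardless of that setting.

```python
import re

# Hypothetical sample: "naive cafe" written with i-diaeresis and e-acute.
sample = "na\u00efve caf\u00e9"

# Default for str patterns in Python 3: \w matches Unicode word characters.
unicode_words = re.findall(r"\w+", sample)

# With re.ASCII, \w only matches [a-zA-Z0-9_].
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)

# Spelling the class out avoids depending on how a flavor defines \w.
explicit_words = re.findall(r"[A-Za-z0-9_]+", sample)

print(unicode_words)
print(ascii_words)
print(explicit_words)
```

Other flavors make different default choices for \w, so an explicit class is often the safer option when a pattern has to move between languages.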

Upvotes: 0

Jonathan Leffler

Reputation: 753695

One of the oldest sets of standardized regular expressions is the POSIX pair of BRE (basic regular expressions) and ERE (extended regular expressions), documented under Regular Expressions.

Other languages may define their own standards. For example, C++ 2011 has a regular expression library defined in clause 28 (about 46 pages of standard). Perl defines its own regular expressions. Other languages borrow from these sources and others. Lex and Flex use their own set of regular expressions. Sed uses its own variant of regular expressions. And Java, JavaScript, and ... define their own versions, sometimes using PCRE (Perl-Compatible Regular Expressions) as the basis for their design. Some of the details are affected by the facilities provided by the language in which the regular expressions are being used.

Jeff Friedl's book Mastering Regular Expressions covers a lot of different sets of regular expressions, identifying what's common and what's different.

Upvotes: 21
