Reputation: 8280
var regex = new Regex(@"{([A-z]*)(([^]|:)((\\:)|[^:])*?)(([^]|:)((\\:)|[^:])*?)}");
The expression is [crudely] designed to find tokens within an input, using the format `{name[:pattern[:format]]}`, where the `pattern` and `format` are optional.
{
([A-z]*) // name
(([^]|:)((\\:)|[^:])*?) // regex pattern
(([^]|:)((\\:)|[^:])*?) // format
}
Additionally, the expression attempts to ignore escaped colons, thus allowing for strings such as {Time:\d+\:\d+\:\d+:hh\:mm\:ss}
When testing on RegExr.com, everything works well enough; however, when I try the same pattern in C#, the input fails to match. Why?
(Any advice on general improvements to the expression is very welcome too.)
Upvotes: 4
Views: 138
Reputation: 626699
The `[^]` pattern is only valid in JavaScript, where it matches "not nothing", i.e. any character (although in ES5 it does not match characters outside the BMP). In C#, it is easy to match any character with `.` by passing the `RegexOptions.Singleline` modifier. JavaScript engines that predate the ES2018 `s` (dotAll) flag do not support such a modifier, but there you may match any character with the `[\s\S]` workaround pattern.
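As a rough JavaScript illustration of the point above, the snippet below counts how many characters each construct consumes in a short sample string that ends with a newline. The sample string and variable names are made up for this sketch.

```javascript
// Illustrative sample: two ordinary characters followed by a newline.
const sample = "a:\n";

// JavaScript-only "empty negated class": matches any single character.
const emptyNegated = sample.match(/[^]/g);

// Portable workaround: whitespace or non-whitespace, i.e. any character.
const spaceOrNot = sample.match(/[\s\S]/g);

// A plain dot without a dotAll/Singleline option skips the newline.
const dotOnly = sample.match(/./g);

console.log(emptyNegated.length, spaceOrNot.length, dotOnly.length);
```

The first two counts should agree, while the plain dot comes up one short because it does not consume the line break.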
So, the minimal change that makes the pattern work in both regex flavors is to replace `([^]|:)` with `[\s\S]`. The `:` alternative is unnecessary, since `[\s\S]` already matches a colon.
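Also, do not use `[A-z]` as a shortcut to match ASCII letters: the range also includes the six punctuation characters that sit between `Z` and `a` in ASCII. Either use `[a-zA-Z]`, or use `[a-z]` and pass a case-insensitive modifier.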
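To see why the `[A-z]` range is risky, here is a small JavaScript sketch that tests the characters lying between `Z` and `a` in ASCII against both character classes. The names `loose` and `strict` are only labels chosen for this example.

```javascript
// Characters between 'Z' (U+005A) and 'a' (U+0061) in ASCII.
const between = ["[", "\\", "]", "^", "_", "`"];

const loose = /^[A-z]$/;      // range used in the question
const strict = /^[A-Za-z]$/;  // ASCII letters only

// Collect the non-letter characters each class accepts.
const looseHits = between.filter((ch) => loose.test(ch));
const strictHits = between.filter((ch) => strict.test(ch));

console.log("[A-z] accepts:", looseHits.join(" "));
console.log("[A-Za-z] accepts:", strictHits.length, "of them");
```

With `[A-z]`, a token name could therefore contain brackets, backslashes, carets, underscores, or backticks, which is probably not what was intended.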
So, you might consider writing the expression as
{([A-Za-z]*)([\s\S]((\\:)|[^:])*?)([\s\S]((\\:)|[^:])*?)}
See a .NET regex test and a JS regex test.
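For a quick local check, the revised pattern can be run against the example token from the question. This is only a sketch in JavaScript; the variable names are invented here, and the group numbers follow from counting the opening parentheses in the pattern.

```javascript
// Revised pattern from the answer, written as a JavaScript regex literal.
const tokenRe = /{([A-Za-z]*)([\s\S]((\\:)|[^:])*?)([\s\S]((\\:)|[^:])*?)}/;

// Example token from the question; String.raw keeps the backslashes literal.
const input = String.raw`{Time:\d+\:\d+\:\d+:hh\:mm\:ss}`;

const m = tokenRe.exec(input);

// Group 1 is the name, group 2 the pattern part, group 5 the format part.
const name = m[1];
const patternPart = m[2];
const formatPart = m[5];

console.log(name);
console.log(patternPart);
console.log(formatPart);
```

Note that groups 2 and 5 include the leading colon, because the separator is consumed by `[\s\S]` inside each group.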
There are certainly other possible enhancements here, such as removing redundant groups or adding support for any escape sequence (not just escaped colons), but those are outside the scope of the question.
Upvotes: 6