PiotrB

Reputation: 143

Custom character class in C# regex

Is there any way to define a custom character class in a C# regex?

In flex it is done in a very obvious way:

DIGIT    [0-9]
%%
{DIGIT}+    {printf( "An integer: %s (%d)\n", yytext, atoi( yytext ) );}

http://westes.github.io/flex/manual/Simple-Examples.html#Simple-Examples

As explained in this answer, in PHP defining a custom character class works like this:

(?(DEFINE)(?<a>[acegikmoqstz@#&]))\g<a>(?:.*\g<a>){2}

Is there a way to achieve this result in C# without repeating the full character class definition each time it is used?

Upvotes: 4

Views: 1184

Answers (2)

Panagiotis Kanavos

Reputation: 131364

Custom character classes aren't supported in C#, but you may be able to use named blocks and character class subtraction to get a similar effect.

.NET supports Unicode general categories, such as \p{Sm} for math symbols, and a large number of named blocks, such as \p{IsGreek} for Greek characters. One of them may already match your requirements.

Character class subtraction allows you to exclude the characters in one class or block from the characters in a broader class. The syntax is:

[base_group-[excluded_group]]

The following example, copied from the linked documentation, matches all Unicode characters except whitespace (\s), punctuation (\p{P}), characters in the Greek block (\p{IsGreek}), and the NEXT LINE control character (\x85):

[\u0000-\uFFFF-[\s\p{P}\p{IsGreek}\x85]]
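As a rough illustration of the subtraction syntax in C#, here is a minimal sketch. The class name SubtractionDemo, the constant Consonant, and the helper IsAllConsonants are made up for this example and do not come from the documentation:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubtractionDemo
{
    // Hypothetical example class: lowercase ASCII letters minus the vowels.
    public const string Consonant = "[a-z-[aeiou]]";

    // True when the whole input consists of characters from the class above.
    public static bool IsAllConsonants(string s) =>
        Regex.IsMatch(s, "^" + Consonant + "+$");

    public static void Main()
    {
        Console.WriteLine(IsAllConsonants("rhythm")); // True
        Console.WriteLine(IsAllConsonants("regex"));  // False
    }
}
```

Keeping the class in a constant means the subtraction expression is written once and reused wherever it is needed.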

Upvotes: 3

Haney

Reputation: 34802

Nope, not supported in C#. This link will give you a nice overview of the .NET Regex engine. Note that nothing really stops you from defining variables and using them to construct your Regex string:

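using System.Text.RegularExpressions;
// Define the character class once, then reuse it when building the pattern.
var digit = "[0-9]";
var regex = new Regex(digit + "[A-Z]");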
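Applied to the PHP pattern from the question, the same idea might look like the sketch below. The names ReusedClassDemo, A, and ThreeFromClass are hypothetical, and the pattern is assumed to be a faithful translation of \g<a>(?:.*\g<a>){2}:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReusedClassDemo
{
    // Character class taken from the PHP pattern in the question.
    private const string A = "[acegikmoqstz@#&]";

    // Assumed equivalent of \g<a>(?:.*\g<a>){2}:
    // matches input containing at least three characters from the class.
    public static readonly Regex ThreeFromClass =
        new Regex(A + "(?:.*" + A + "){2}");

    public static void Main()
    {
        Console.WriteLine(ThreeFromClass.IsMatch("a-c-e")); // True
        Console.WriteLine(ThreeFromClass.IsMatch("a-b-d")); // False
    }
}
```

Plain concatenation is used here rather than an interpolated string, so the {2} quantifier does not need its braces escaped.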

Upvotes: 2
