JamesBrownIsDead

Reputation: 4715

.NET RegEx: Replace unsanitary characters

Let's say I have a string that can contain any UTF-16 characters, but I want to replace all characters not in a whitelist with an underscore. Let's say the whitelist is [A-Za-z], [0-9], and [-:.].

How would I use the Regex class to replace all characters not in the whitelist?

Upvotes: 1

Views: 623

Answers (1)

Steve Wortham

Reputation: 22240

You can do it with this:

[^A-Za-z0-9:.-]

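A caret at the start of a character class negates it, so this matches any single character that is not listed. The hyphen is placed last so that it is treated as a literal hyphen rather than as a range operator.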

And then you simply replace the matches with an underscore like this:

// No RegexOptions needed: Multiline only changes how ^ and $ anchors behave, and this pattern has none.
Regex myRegex = new Regex(@"[^A-Za-z0-9:.-]");
return myRegex.Replace("your target string here", "_");

Here it is in action.
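For reuse, the same replacement can be wrapped in a small helper. This is only a sketch: the names Sanitizer, NotAllowed and Sanitize are made up for illustration, and it assumes the input is never null. It also shows two consequences of the whitelist that are easy to miss. Letters outside ASCII are replaced, and because .NET regexes work on UTF-16 code units, a character stored as a surrogate pair becomes two underscores.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative and not from the original answer.
public static class Sanitizer
{
    // Built once and reused; matches any single UTF-16 code unit outside the whitelist.
    private static readonly Regex NotAllowed = new Regex(@"[^A-Za-z0-9:.-]");

    // Replaces every non-whitelisted character with an underscore.
    // Assumes input is not null (Regex.Replace throws ArgumentNullException otherwise).
    public static string Sanitize(string input)
    {
        return NotAllowed.Replace(input, "_");
    }
}

// Example calls:
//   Sanitizer.Sanitize("file name: v1.0-beta!")  returns "file_name:_v1.0-beta_"
//   Sanitizer.Sanitize("h\u00E9llo")             returns "h_llo"   (non-ASCII letter replaced)
//   Sanitizer.Sanitize("a\uD83D\uDE00b")         returns "a__b"    (surrogate pair = two code units)
```

If a run of disallowed characters should collapse into a single underscore, adding a + quantifier after the character class would do that, but it changes the one-for-one replacement asked for in the question.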

Upvotes: 4
