NoobDeveloper

Reputation: 1887

Strip non ascii chars but allow currency symbols

I am using the regex below to strip all non-ASCII characters from a string.

String pattern = @"[^\u0000-\u007F]";
Regex rx = new Regex(pattern, RegexOptions.Compiled);
data = rx.Replace(data, " "); // Replace returns a new string; it does not modify data in place

However, I want to allow currency symbols (such as the pound sign) and trademark symbols.

I have modified the regex as shown below, and it works for me. Can anyone confirm whether it is valid?

 String pattern = @"[^\u0000-\u007F \p{Sc}]";

Basically, I want to allow all currency symbols too.

Upvotes: 3

Views: 1042

Answers (1)

Oscar Mederos

Reputation: 29843

Yes, your regex is correct.

What your code does is replace every character matched by your regular expression with a space.

Now, what characters does your regular expression match?

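Anything except:

- characters in the range \u0000-\u007F (the ASCII range)
- the space character
- characters in the Unicode category Sc (currency symbols), matched by \p{Sc}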

If you want to keep allowing other characters, yes, you can add them to the class too (exactly as you did with \p{Sc}).

Edit:

Be careful when doing it in the future. The regex would really be [^\u0000-\u007F\p{Sc}] (no space), although in this case it doesn't matter since the space character was already in the ASCII range.
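
For illustration, here is a minimal sketch of the pattern without the redundant space, wrapped in a small helper class. The class and method names (AsciiCleaner, Clean, CleanKeepingTrademark) are made up for this example. One thing to watch: \p{Sc} covers currency symbols only. The trademark sign (U+2122) and the registered sign (U+00AE) belong to the Unicode category So, so the pattern from the question will still replace them. The second pattern below lists them explicitly, assuming those two are the trademark symbols you care about.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answer.
public static class AsciiCleaner
{
    // Matches any character that is neither ASCII (U+0000-U+007F)
    // nor a currency symbol (Unicode category Sc).
    private static readonly Regex NonAsciiExceptCurrency =
        new Regex(@"[^\u0000-\u007F\p{Sc}]", RegexOptions.Compiled);

    // Assumption: "trademark symbols" means U+2122 and U+00AE.
    // They are category So, not Sc, so they are listed explicitly.
    private static readonly Regex NonAsciiExceptCurrencyAndTrademark =
        new Regex(@"[^\u0000-\u007F\p{Sc}\u2122\u00AE]", RegexOptions.Compiled);

    // Replaces every matched character with a single space.
    public static string Clean(string data) =>
        NonAsciiExceptCurrency.Replace(data, " ");

    // Same as Clean, but keeps the trademark and registered signs.
    public static string CleanKeepingTrademark(string data) =>
        NonAsciiExceptCurrencyAndTrademark.Replace(data, " ");
}
```

With the input "\u00A35 caf\u00E9 Acme\u2122" (a pound sign, an accented e, and a trademark sign), Clean keeps the pound sign but turns both the accented e and the trademark sign into spaces, while CleanKeepingTrademark leaves the trademark sign in place.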

Upvotes: 2
