Reputation: 1514
Currently I am using a StringBuilder to remove a list of characters from a string, as below:
char[] charArray = {
'%', '&', '=', '?', '{', '}', '|', '<', '>',
';', ':', ',', '"', '(', ')', '[', ']', '\\',
'/', '*', '+', ' ' };
// Remove special characters that aren't allowed
var sanitizedAddress = new StringBuilder();
foreach (var character in emailAddress.ToCharArray())
{
if (Array.IndexOf(charArray, character) < 0)
sanitizedAddress.Append(character);
}
I tried to use Regex for the same, as follows:
var invalidCharacters = Regex.Escape(@"%&=?{}|<>;:,\"()[]\\/*+\s");
emailAddress = Regex.Replace(emailAddress, invalidCharacters, "");
Upvotes: 1
Views: 67
Reputation: 186678
You can try using Linq (to filter out the unwanted characters with the help of Where) instead of regular expressions:
using System.Collections.Generic; // for HashSet<T>
using System.Linq;
...
// Hash set is faster on Contains operation than array - O(1) vs. O(N)
HashSet<char> toRemove = new HashSet<char>() {
'%', '&', '=', '?', '{', '}', '|', '<', '>',
';', ':', ',', '"', '(', ')', '[', ']', '\\',
'/', '*', '+', ' ' };
string emailAddress = ...
emailAddress = string.Concat(emailAddress
.Where(c => !toRemove.Contains(c)));
You can chain more Where filters, e.g.:
emailAddress = string.Concat(emailAddress
.Where(c => !toRemove.Contains(c))
.Where(c => !char.IsWhiteSpace(c))); // get rid of white spaces as well
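Put together, the Linq approach could look like the sketch below. The class name EmailSanitizer and the method name Sanitize are hypothetical, chosen only for illustration; the character list is the one from the question, and the two filters are merged into a single Where.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EmailSanitizer
{
    // Characters to strip; same list as in the question.
    private static readonly HashSet<char> ToRemove = new HashSet<char>
    {
        '%', '&', '=', '?', '{', '}', '|', '<', '>',
        ';', ':', ',', '"', '(', ')', '[', ']', '\\',
        '/', '*', '+', ' '
    };

    // Hypothetical helper: keeps every character that is neither in
    // ToRemove nor any kind of white space (tabs, newlines, ...).
    public static string Sanitize(string emailAddress) =>
        string.Concat(emailAddress.Where(
            c => !ToRemove.Contains(c) && !char.IsWhiteSpace(c)));
}
```

For example, EmailSanitizer.Sanitize("john (work) <j.doe+tag@example.com>") returns "johnworkj.doetag@example.com".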
In case you insist on regular expressions you have to build the pattern, e.g.:
char[] charArray = {
'%', '&', '=', '?', '{', '}', '|', '<', '>',
';', ':', ',', '"', '(', ')', '[', ']', '\\',
'/', '*', '+', ' ' };
// Join all the characters (escaped!) with | ("or" in regular expressions)
string pattern = string.Join("|", charArray
.Select(c => Regex.Escape(c.ToString())));
And then you can Replace:
emailAddress = Regex.Replace(emailAddress, pattern, "");
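As a self-contained sketch of the same idea, the pattern can be built once and reused. The names AlternationSanitizer and Sanitize are hypothetical, and caching the Regex in a static field is a design choice added here, not part of the snippet above.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class AlternationSanitizer
{
    // Same character list as above.
    private static readonly char[] CharArray =
    {
        '%', '&', '=', '?', '{', '}', '|', '<', '>',
        ';', ':', ',', '"', '(', ')', '[', ']', '\\',
        '/', '*', '+', ' '
    };

    // Each character is escaped individually and the results are joined
    // with '|', giving one alternation such as %|&|=|\?|...
    private static readonly Regex Invalid = new Regex(
        string.Join("|", CharArray.Select(c => Regex.Escape(c.ToString()))));

    // Hypothetical helper: removes every match of the alternation.
    public static string Sanitize(string emailAddress) =>
        Invalid.Replace(emailAddress, "");
}
```

For example, AlternationSanitizer.Sanitize("a b%c[d]\\e") returns "abcde".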
Upvotes: 1
Reputation: 5908
You can use a character set [...] for this:
var invalidCharacters = "[" + Regex.Escape(@"%&=?{}|<>;:,""()\*/+") + @"\]\[\s]";
emailAddress = Regex.Replace(emailAddress, invalidCharacters, "");
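Wrapped into a reusable helper, the character-set version might look like this sketch. CharClassSanitizer and Sanitize are hypothetical names used only for illustration; the pattern itself is the one shown above.

```csharp
using System.Text.RegularExpressions;

public static class CharClassSanitizer
{
    // Regex.Escape handles most metacharacters but leaves ']' untouched,
    // so ']' and '[' are appended already escaped, together with \s,
    // which must not go through Regex.Escape (it would become \\s).
    private static readonly string Pattern =
        "[" + Regex.Escape(@"%&=?{}|<>;:,""()\*/+") + @"\]\[\s]";

    private static readonly Regex Invalid = new Regex(Pattern);

    // Hypothetical helper: removes every character in the set,
    // including all white space matched by \s.
    public static string Sanitize(string emailAddress) =>
        Invalid.Replace(emailAddress, "");
}
```

For example, CharClassSanitizer.Sanitize("a]b[c d") returns "abcd".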
Some side notes:
- In a verbatim string (@"..."), a double quote is escaped as "", not \".
- \s is already an escape sequence, so Regex.Escape will render it as \\s, which is not what you wanted.
- Regex.Escape does not escape the ] character (only the opening [), which is why it is added separately.
Upvotes: 1