Ran

Reputation: 725

Removing special characters from a string with RegEx

I am reading a text file that contains words, numbers and special characters. I want to remove certain special characters, such as: [](),'

I have this code, but it is not working:

using (var reader = new StreamReader ("C://Users//HP//Documents//result2.txt")) {
            string line = reader.ReadToEnd ();

            Regex rgx = new Regex ("[^[]()',]");
            string res = rgx.Replace (line, "");
            Message1.text = res;

        }

What am I missing? Thanks.

Upvotes: 1

Views: 243

Answers (4)

glenebob

Reputation: 1973

I agree with avoiding regex for this, but I would not use string.Replace multiple times, either.

Consider implementing a Replace or Remove method that accepts an array of characters to remove and scans the input string only once. For example:

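// Assumes `input` is the string to clean and `chars` is a char[] of characters to drop.
// Requires: using System.Linq; using System.Text;
var builder = new StringBuilder();

foreach (char ch in input)
{
    // Keep only the characters that are not in the removal list.
    if (!chars.Contains(ch))
    {
        builder.Append(ch);
    }
}

return builder.ToString();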
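
For completeness, here is a minimal self-contained sketch of the same single-pass idea. The class name `CharRemover` and the method name `Remove` are made up for illustration, and the `HashSet<char>` is my own choice to make each membership check constant-time; it is not required for correctness.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CharRemover
{
    // Hypothetical helper: returns `input` with every character in `toRemove` dropped.
    public static string Remove(string input, IEnumerable<char> toRemove)
    {
        // A HashSet gives constant-time lookups for each scanned character.
        var unwanted = new HashSet<char>(toRemove);
        var builder = new StringBuilder(input.Length);

        // Single pass over the input: copy only the characters we want to keep.
        foreach (char ch in input)
        {
            if (!unwanted.Contains(ch))
            {
                builder.Append(ch);
            }
        }

        return builder.ToString();
    }
}
```

Since a string is itself an `IEnumerable<char>`, a call such as `CharRemover.Remove("[('a', 1)]", "[](),'")` should return `a 1`.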

Upvotes: 0

maccettura

Reputation: 10818

You could skip Regex, maintain a list of the characters you want to remove, and replace them the old-fashioned way:

string[] specialCharsToRemove = new [] { "[", "]", "(", ")", "'", "," };

using (var reader = new StreamReader ("C://Users//HP//Documents//result2.txt")) 
{
    string line = reader.ReadToEnd();
    foreach(string s in specialCharsToRemove)
    {
        line = line.Replace(s, string.Empty);
    } 
    Message1.text = line;
}

Ideally this would be in its own method, something like this:

private static string RemoveCharacters(string input, string[] specialCharactersToRemove)
{
    foreach(string s in specialCharactersToRemove)
    {
        input = input.Replace(s, string.Empty);
    }
    return input;
}

I made a fiddle here

Upvotes: 1

Rotem

Reputation: 21917

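Some of the characters in your Regex, specifically [ ] ( ) ^, hold special meaning in Regex, and in order to use them literally they must be escaped. In your pattern, the leading ^ negates the character class and the first unescaped ] closes it early, so the pattern does not match what you intended.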

Use the following properly escaped Regex:

Regex rgx = new Regex (@"[\^\[\]\(\)',]");

Note that the @ verbatim string is necessary here, because the backslashes should reach the Regex as-is rather than being interpreted as C# string escapes.

Alternatively, double escape the backslashes:

Regex rgx = new Regex ("[\\^\\[\\]\\(\\)',]");

But that's less readable in this case.
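
Here is a small self-contained sketch of how the escaped character class could be wrapped up and used. The class name `SpecialCharStripper` and the method name `Strip` are made up for illustration. Note one deliberate difference: I am assuming you only want to remove the six characters listed in the question, so `^` is left out of the class here, whereas the pattern above removes `^` as well.

```csharp
using System.Text.RegularExpressions;

public static class SpecialCharStripper
{
    // Character class matching any one of: [ ] ( ) ' ,
    // Assumption: '^' is intentionally omitted, so it is NOT removed.
    private static readonly Regex Unwanted = new Regex(@"[\[\]\(\)',]");

    // Hypothetical helper: replaces every match with the empty string.
    public static string Strip(string input)
    {
        return Unwanted.Replace(input, string.Empty);
    }
}
```

For example, `SpecialCharStripper.Strip("[('apple', 3), ('pear', 5)]")` should give `apple 3 pear 5`. Keeping the Regex in a static field also means it is constructed once rather than on every call.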

Upvotes: 3

Ctznkane525

Reputation: 7465

Replace them one at a time with String.Replace:

using (var reader = new StreamReader ("C://Users//HP//Documents//result2.txt"))
{
    string line = reader.ReadToEnd ();

    // string.Replace takes (oldValue, newValue) and returns a new string.
    string res = line.Replace("[", "");
    res = res.Replace("]", "");
    res = res.Replace("(", "");
    res = res.Replace(")", "");
    res = res.Replace("'", "");
    res = res.Replace(",", "");
    Message1.text = res;
}

Upvotes: 0
