SoundChaos

Reputation: 307

How to avoid using multiple Regex.Replace calls

The goal is to read in a text file and normalize it: convert everything to upper case, remove all special characters, and turn every new line into a single space.

This is my current messy code to do it, and as far as I can tell it does work.

public string readTextFile(string fileName)
{
    Regex rgx = new Regex("[^A-Z ]");
    string txtFile = File.ReadAllText(fileName).ToUpper();

    txtFile = Regex.Replace(txtFile, @"\s+", " ", RegexOptions.Multiline);
    return rgx.Replace(txtFile, "");
}

I'm looking for help cleaning this code up, improving its efficiency, and possibly combining my regex statements into one.

Upvotes: 1

Views: 91

Answers (1)

Grundy

Reputation: 13381

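You can combine your two patterns into a single alternation and use the Replace overload that takes a MatchEvaluator. The evaluator checks which group matched: a special character (group 2) is removed, and a run of whitespace (group 1) becomes a single space, like this: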

public string readTextFile(string fileName)
{
    string txtFile = File.ReadAllText(fileName).ToUpper();

    // Group 1: a run of whitespace (newlines included) -> one space.
    // Group 2: any character other than A-Z or a space -> removed.
    // RegexOptions.Multiline is not needed: it only affects ^ and $.
    return Regex.Replace(txtFile, @"(\s+)|([^A-Z ])",
                m => m.Groups[2].Success ? string.Empty : " ");
}
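If you call this often, it may be worth building the Regex once and reusing it. Here is a minimal sketch of that idea, working on an in-memory string so it is easy to try out. The names TextNormalizer and Normalize are made up for illustration, and ToUpperInvariant is my assumption (it avoids culture-specific casing surprises); swap ToUpper back in if you want the current culture's rules.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class TextNormalizer
{
    // Built once and reused across calls.
    // Group 1: a run of whitespace. Group 2: any character other than A-Z or a space.
    private static readonly Regex Pattern = new Regex(@"(\s+)|([^A-Z ])");

    public static string Normalize(string text)
    {
        // Upper-case first so lower-case letters survive the [^A-Z ] filter.
        return Pattern.Replace(
            text.ToUpperInvariant(),
            m => m.Groups[2].Success ? string.Empty : " ");
    }
}
```

For example, TextNormalizer.Normalize("Hello,\nWorld!") should give HELLO WORLD. Note that, just like the two-step version, a special character sitting between two spaces leaves both spaces behind ("A ! B" becomes "A  B"), so run a final whitespace collapse if that matters to you.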

Upvotes: 1
