TheGeneral

Reputation: 81483

Replace non alphanumeric characters and multiple spaces with just 1 space

I'm trying to replace all "non alphanumeric characters" and "multiple spaces" with a single space.

I have two working solutions; however, I'm wondering whether it's possible to combine them efficiently.

Given

var str = "ASD S-DF 2134 4@# 4    234234 #$)(u SD";
var options = RegexOptions.None;

Solution for non alphanumeric characters

var rgxAlpha = new Regex("[^a-zA-Z0-9]", options);
str = rgxAlpha.Replace(str, " ");

Solution for multiple spaces

var regexSpace = new Regex(@"[ ]{2,}", options);
str = regexSpace.Replace(str, " ");

Expected Result

ASD S DF 2134 4 4 234234 u SD

Upvotes: 4

Views: 4307

Answers (4)

ToolmakerSteve

Reputation: 21213

For comparison to Regex answers, this is a code solution that doesn't use Regex. I almost never use Regex, but you can see that the Regex is quite simple compared to this.

[I haven't come up with a Linq alternative; I don't see a simple way to map the multi-character replacement rule to Linq.]

/// <summary> Alphanumeric characters are appended to result string.
/// One or more non-alphanumeric characters are replaced by ONE "replacement" character.
/// </summary>
static private string ReplaceNonAlphaNumeric(string str, char replacement = ' ')
{
    var sb = new System.Text.StringBuilder();

    // True when last character was replaced by "replacement" character.
    bool replacing = false;
    foreach (char cr in str)
    {
        if (char.IsLetterOrDigit(cr))
        {
            sb.Append(cr);
            replacing = false;
        }
        else
        {   // blank or other non-alphanumeric.
            if (replacing)
                continue;   // Only output one "replacement" character.

            sb.Append(replacement);
            replacing = true;
        }
    }

    return sb.ToString();
}

Usage: ReplaceNonAlphaNumeric(yourString)

Examples:

"a1b23c456" => "a1b23c456"
"a1   b23c 456" => "a1 b23c 456"
"a1-b23c! ?#456" => "a1 b23c 456"

OR: ReplaceNonAlphaNumeric(yourString, '-')

"a1   b23c 456" => "a1-b23c-456"
"a1-b23c! ?#456" => "a1-b23c-456"

Upvotes: 0

Saverio Terracciano

Reputation: 3915

Assuming they both work:

var rgxPattern = new Regex(@"[^a-zA-Z0-9]+|[ ]{2,}");

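Just add a | (alternation) between them. The + after the character class also makes each run of non-alphanumeric characters a single match, so the run is replaced by one space.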
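
As a rough sketch of how the combined pattern could be applied to the string from the question (the Program class and Main wrapper are only scaffolding added here so the snippet runs on its own):

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Sample input taken from the question.
        var str = "ASD S-DF 2134 4@# 4    234234 #$)(u SD";

        // Either branch of the alternation may match; every match becomes one space.
        var rgxPattern = new Regex(@"[^a-zA-Z0-9]+|[ ]{2,}");
        str = rgxPattern.Replace(str, " ");

        Console.WriteLine(str);   // prints: ASD S DF 2134 4 4 234234 u SD
    }
}
```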

Upvotes: 2

Gary Kaizer

Reputation: 284

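    string szOut = "";
    char previous = '\0';   // initialized so the first comparison compiles
    foreach (char cr in str)
    {
        // Treat every non-alphanumeric character as a space.
        char current = IsAlphaNumeric(cr) ? cr : ' ';

        // Skip a space that directly follows another space.
        if (current == ' ' && previous == ' ')
            continue;

        szOut += current;
        previous = current;
    }
    return szOut;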

....

    public static Boolean IsAlphaNumeric(char cr)
    {
        return char.IsLetter(cr) || char.IsNumber(cr) || cr == ' ';
    }

Upvotes: 0

Avinash Raj

Reputation: 174696

The pattern below is enough on its own: since [^a-zA-Z0-9]+ also matches spaces, you don't need to add [ ]{2,} explicitly.

string result = Regex.Replace(str, @"[^a-zA-Z0-9]+", " ");
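
One thing to keep in mind: a run of non-alphanumeric characters at the very start or end of the input also becomes a single space. The sketch below wraps the one-liner in a helper and trims the result. The name CollapseNonAlphanumeric and the decision to call Trim() are assumptions made for illustration, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

static class Demo
{
    // Hypothetical helper name, introduced only for this example.
    static string CollapseNonAlphanumeric(string input)
    {
        // Each run of one or more non-alphanumeric characters becomes one space.
        string collapsed = Regex.Replace(input, @"[^a-zA-Z0-9]+", " ");

        // Assumption: leading/trailing spaces are unwanted; remove Trim() to keep them.
        return collapsed.Trim();
    }

    static void Main()
    {
        Console.WriteLine(CollapseNonAlphanumeric("ASD S-DF 2134 4@# 4    234234 #$)(u SD"));
        // prints: ASD S DF 2134 4 4 234234 u SD

        Console.WriteLine(CollapseNonAlphanumeric("--hello,   world!!"));
        // prints: hello world
    }
}
```

If the surrounding spaces should be preserved, the Trim() call can simply be dropped.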

Upvotes: 7
