
Reputation: 120

How to replace all given characters?

I'm trying to write a method that replaces all occurrences of the characters in the input array (charsToReplace) with the replacementCharacter using regex. The version I have written does not work if the array contains any characters that may change the meaning of the regex pattern, such as ']' or '^'.

public static string ReplaceAll(string str, char[] charsToReplace, char replacementCharacter)
{
    if(str.IsNullOrEmpty())
    {
        return string.Empty;
    }

    var pattern = $"[{new string(charsToReplace)}]";
    return Regex.Replace(str, pattern, replacementCharacter.ToString());
}

So ReplaceAll("/]a", new[] { '/', ']' }, 'a') should return "aaa".

Upvotes: 3

Views: 204

Answers (2)

Wiktor Stribiżew

Reputation: 626689

Inside a character class, only four characters require escaping: ^, -, ] and \. You can't use Regex.Escape because it does not escape - and ], as they are not "special" outside a character class. Note that Regex.Escape is meant to be used only for literal characters (or sequences of them) that appear outside character classes.

An unescaped ] char will close your character class prematurely and that is the main reason why your code does not work.
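To make the premature close concrete, here is a minimal sketch that builds the pattern the same way the question does and runs it on the question's input. The class name PrematureCloseDemo and the helper NaivePattern are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the question's code.
public static class PrematureCloseDemo
{
    // Builds the pattern as in the question: no escaping at all.
    public static string NaivePattern(char[] chars) => $"[{new string(chars)}]";

    public static void Main()
    {
        var pattern = NaivePattern(new[] { '/', ']' });

        // The class is closed by the first ']', so the pattern reads as
        // "the class [/] followed by a literal ]".
        Console.WriteLine(pattern);                            // [/]]

        // "/]" is matched as one unit and replaced by a single 'a'.
        Console.WriteLine(Regex.Replace("/]a", pattern, "a")); // aa
    }
}
```

The result is "aa" rather than the expected "aaa", because the two characters are consumed by one match instead of being replaced individually.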

So, the fixed pattern variable definition can look like

var pattern = $"[{string.Concat(charsToReplace).Replace(@"\", @"\\").Replace("-", @"\-").Replace("^", @"\^").Replace("]", @"\]")}]";

See an online C# demo.
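For completeness, a sketch of the question's method with the escaped pattern dropped in. The class name CharReplacer is hypothetical, and the guard for a null or empty charsToReplace is an added assumption (an empty class "[]" is not a valid pattern); the rest follows the question's signature and the escaping shown above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical container class; only ReplaceAll mirrors the question.
public static class CharReplacer
{
    public static string ReplaceAll(string str, char[] charsToReplace, char replacementCharacter)
    {
        if (string.IsNullOrEmpty(str))
        {
            return string.Empty; // same behavior as the question's version
        }

        // Assumption: with nothing to replace, return the input unchanged.
        if (charsToReplace == null || charsToReplace.Length == 0)
        {
            return str;
        }

        // Escape the backslash first so the backslashes added by the
        // later replacements are not doubled again.
        var escaped = new string(charsToReplace)
            .Replace(@"\", @"\\")
            .Replace("-", @"\-")
            .Replace("^", @"\^")
            .Replace("]", @"\]");

        return Regex.Replace(str, $"[{escaped}]", replacementCharacter.ToString());
    }
}
```

With this version, CharReplacer.ReplaceAll("/]a", new[] { '/', ']' }, 'a') returns "aaa", as the question expects.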

Upvotes: 1

Dmitrii Bychenko

Reputation: 186668

I suggest using Linq, not regular expressions:

using System.Linq;

...

public static string ReplaceAll(
  string str, char[] charsToReplace, char replacementCharacter) 
{
   // Please, note IsNullOrEmpty syntax
   // we should validate charsToReplace as well
   if (string.IsNullOrEmpty(str) || null == charsToReplace || charsToReplace.Length <= 0)
     return str; // let's just do nothing (say, not turn null into empty string)

   return string.Concat(str.Select(c => charsToReplace.Contains(c) 
     ? replacementCharacter
     : c));
}

If you insist on Regex, note that we should Regex.Escape the characters within charsToReplace. However, according to the manual, Regex.Escape doesn't escape - and ], which have special meaning within regular expression brackets, so these two have to be escaped by hand:

public static string ReplaceAll(
  string str, char[] charsToReplace, char replacementCharacter) {

  if (string.IsNullOrEmpty(str) || null == charsToReplace || charsToReplace.Length <= 0)
    return str;

  string charsSet = string.Concat(charsToReplace
    .Select(c => new char[] { ']', '-' }.Contains(c) // in case of '-' and ']'
       ? $@"\{c}"                                    // escape them as well
       : Regex.Escape(c.ToString())));

  return Regex.Replace(
     str,
    $"[{charsSet}]+",
     m => new string(replacementCharacter, m.Length));
}

Upvotes: 1
