Alex Zhukovskiy

Reputation: 10025

Regex replace multiple groups with wildcards

I found an answer that covers the case where the patterns contain no wildcard characters. My question is how to perform multiple replacements with one regex when the patterns do contain wildcards. This code shows what I want to do:

internal class Program
{
    private static void Main(string[] args)
    {
        var rules = new Dictionary<string, string>
                    {
                        {@"F\S+", "Replace 1"},
                        {@"\S+z", "Replace 2"},
                    };

        string s = "Foo bar baz";
        string result = ProcessText(s, rules);
        Console.WriteLine(result);
    }

    private static string ProcessText(string input, Dictionary<string, string> rules)
    {
        string[] patterns = rules.Keys.ToArray();
        string pattern = string.Join("|", patterns);
        return Regex.Replace(input, pattern, match =>
                                             {
                                                 int index = GetMatchIndex(match);
                                                 return rules[patterns[index]];
                                             });
    }

    private static int GetMatchIndex(Match match)
    {
        int i = 0;
        foreach (Match g in match.Groups)
        {
            if (g.Success)
                return i;
            i++;
        }
        throw new Exception("Never throws");
    }
}

But match.Groups.Count is always 1, so I cannot tell which pattern produced the match.

I'm looking for the fastest alternative. Perhaps it shouldn't use regexes at all.

Upvotes: 0

Views: 516

Answers (2)

RePierre

Reputation: 9566

I don't understand why you are concatenating the patterns and then performing so many lookups in an array.

Can't you just apply each pattern individually, like this?

var rules = new Dictionary<string, string>
                {
                    {@"F\S+", "Replace 1"},
                    {@"\S+z", "Replace 2"},
                };

string s = "Foo bar baz";
var result = rules.Aggregate(s, (seed, rule) => Regex.Replace(seed, rule.Key, m => rule.Value));
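
One thing to keep in mind with this approach: applying the rules one after another is not always equivalent to a single pass, because a later rule can also match text produced by an earlier one. A minimal sketch of the difference, using two made-up rules ("a" to "b", "b" to "c") that are not from the question:

```csharp
using System;
using System.Text.RegularExpressions;

internal static class SequentialVsSinglePass
{
    private static void Main()
    {
        const string input = "ab";

        // Sequential: the second rule also rewrites the output of the first.
        string sequential = Regex.Replace(input, "a", "b");
        sequential = Regex.Replace(sequential, "b", "c");
        Console.WriteLine(sequential); // cc

        // Single pass over the original text: each position is replaced at most once.
        string singlePass = Regex.Replace(input, "a|b", m => m.Value == "a" ? "b" : "c");
        Console.WriteLine(singlePass); // bc
    }
}
```

For the two rules in the question both approaches give the same result, since neither replacement text is matched by the other pattern.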

EDIT

Your match.Groups.Count is always 1 because your patterns define no capturing groups. The only group is group 0, whose value is the entire matched string, as described on MSDN. In other words, your GetMatchIndex method cannot tell you which pattern matched.
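
This is easy to check directly. A small sketch, assuming the input string from the question; the second pattern wraps each alternative in a capturing group purely for comparison:

```csharp
using System;
using System.Text.RegularExpressions;

internal static class GroupCountDemo
{
    private static void Main()
    {
        // No parentheses in the pattern: only group 0 (the whole match) exists.
        Match plain = Regex.Match("Foo bar baz", @"F\S+|\S+z");
        Console.WriteLine(plain.Groups.Count); // 1

        // Each alternative wrapped in a capturing group: group 0 plus two more.
        Match grouped = Regex.Match("Foo bar baz", @"(F\S+)|(\S+z)");
        Console.WriteLine(grouped.Groups.Count);      // 3
        Console.WriteLine(grouped.Groups[1].Success); // True, "Foo" came from the first alternative
        Console.WriteLine(grouped.Groups[2].Success); // False
    }
}
```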

You could try wrapping each pattern in a named group like this (the names get a letter prefix because .NET treats a purely numeric name as a group number, and group number 0 is reserved for the whole match):

var patterns = rules.Select((kvp, index) => new
{
    Key = String.Format("(?<g{0}>{1})", index, kvp.Key),
    Value = kvp.Value
}).ToArray();

Having this array, in the GetMatchIndex method you'd just parse the group name to get the index of the matched pattern:

private static int GetMatchIndex(Regex regex, Match match)
{
    foreach (var name in regex.GetGroupNames())
    {
        // Skip group "0" (the whole match); only the "g<index>" groups are ours.
        if (!name.StartsWith("g"))
            continue;
        if (match.Groups[name].Success)
            return int.Parse(name.Substring(1)); // group name is "g" + index
    }
    return -1;
}

Now, you can use it like this:

var pattern = String.Join("|", patterns.Select(x => x.Key));
var regex = new Regex(pattern);
// The instance Replace overload takes the input and the evaluator, not the pattern.
return regex.Replace(input, m =>
{
    var index = GetMatchIndex(regex, m);
    return patterns[index].Value;
});
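
Put together, a complete program along these lines might look like the sketch below. It is only one possible arrangement: the class name, the "g" prefix on the group names, and the use of a list instead of a Dictionary are my own assumptions. The list is there because the order of alternatives decides which pattern wins when several could match at the same position, and Dictionary does not guarantee an enumeration order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

internal static class MultiReplaceDemo
{
    private static string ProcessText(string input, IList<KeyValuePair<string, string>> rules)
    {
        // Wrap rule i in a named group "g<i>" and join the alternatives with "|".
        string pattern = string.Join("|", rules.Select((r, i) => $"(?<g{i}>{r.Key})"));
        var regex = new Regex(pattern);

        return regex.Replace(input, m =>
        {
            // Find the alternative that actually took part in this match.
            for (int i = 0; i < rules.Count; i++)
            {
                if (m.Groups["g" + i].Success)
                    return rules[i].Value;
            }
            return m.Value; // not expected: every match comes from one alternative
        });
    }

    private static void Main()
    {
        var rules = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>(@"F\S+", "Replace 1"),
            new KeyValuePair<string, string>(@"\S+z", "Replace 2"),
        };

        Console.WriteLine(ProcessText("Foo bar baz", rules)); // Replace 1 bar Replace 2
    }
}
```

If the same rule set is applied to many inputs, building the Regex once and reusing it avoids re-parsing the pattern on every call.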

Upvotes: 1

smiech

Reputation: 750

To make it (probably much) faster, extract

var keyArray = rules.Keys.ToArray();

once, before you use it in:

return Regex.Replace(input, pattern, match =>
{
    int index = GetMatchIndex(match);
    return rules[keyArray[index]];
});

Upvotes: 0
