Rudresha Parameshappa

Reputation: 3926

Use RegEx to uppercase and lowercase the string

I am trying to convert parts of a string to lowercase and uppercase based on their position.

My string is a language code like cc-CC, where cc is the language code and CC is the country code. The user can enter it in any casing, such as "cC-Cc". I am using a regular expression to check whether the input is in the cc-CC format.

var regex = new Regex("^[a-z]{2}-[A-Z]{2}$", RegexOptions.IgnoreCase);
// I could validate the code against the CultureInfo list from the .NET Framework,
// but the requirement is to also allow invalid language codes
// as long as the entered code is in the cc-CC format.

Now, when the user enters something like cC-Cc, I want to lowercase the first two characters and uppercase the last two.

I can split the string using - and then concatenate them.

var languageDetails = languageCode.Split('-');
var languageCodeUpdated = $"{languageDetails[0].ToLowerInvariant()}-{languageDetails[1].ToUpperInvariant()}";

I wondered whether I could avoid creating multiple strings and use the regex itself to change the case.

While searching, I found solutions that use \L and \U, but I cannot use them because the C# compiler reports an error. Regex.Replace() also has an overload that takes a MatchEvaluator delegate, which I don't understand.

Is there any way in C# to use a regex to replace uppercase with lowercase and vice versa?

Upvotes: 5

Views: 8202

Answers (2)

RoJaIt

Reputation: 461

TL;DR: This is Regex.Replace with \U and \L support.

    private static string EnhancedReplace(string input, string pattern, string replacement, RegexOptions options)
    {
        replacement = Regex.Replace(replacement, @"(?<mode>\\[UL])(?<group>\$((\d+)|({[^}]+})))", @"<!<mode:${mode}>%&${group}&%>");
        var output = Regex.Replace(input, pattern, replacement, options);
        output = Regex.Replace(output, @"<!<mode:\\L>%&(?<value>[\w\W]*?)&%>", x => x.Groups["value"].Value.ToLower());
        output = Regex.Replace(output, @"<!<mode:\\U>%&(?<value>[\w\W]*?)&%>", x => x.Groups["value"].Value.ToUpper());
        return output;
    }

How To Use

Call the function with \U followed by the group to be uppercased:

var result = EnhancedReplace(input, @"(public \w+ )(\w)", @"$1\U$2", RegexOptions.None);

Will replace this:

public string test12 { get; set; } = "test3";

With that:

public string Test12 { get; set; } = "test3";

Details

I'm currently working on an app which allows the user to define a batch of Regex.Replace operations. For example, the user enters JSON and the batch converts it to a C# class. Speed is therefore not a key requirement, but being able to use \U and \L is very handy. This method applies Regex.Replace three times to the whole content and once to the replacement string, so it is at least three times slower than a plain Regex.Replace without \U and \L support.

Step by Step

  1. The first Regex.Replace enhances the replacement string. It replaces \U$1 with <!<mode:\U>%&$1&%> (this also works for named groups: ${groupName}).

  2. The new replacement is applied to the content.

  3. & 4. The inserted placeholder is now relatively unique. That allows you to search only for <!<mode:\U>%&Actual Value&%> (the backslash is escaped as \\U in the search pattern) and use a MatchEvaluator to replace it with its uppercase version. The same is done for \L.
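The steps above can be traced with a small sketch that returns each intermediate string. It reuses the same patterns as EnhancedReplace; the class name PlaceholderTrace and the method Trace are hypothetical names introduced only for this illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the original answer): exposes the
// intermediate strings produced by the placeholder technique.
public static class PlaceholderTrace
{
    public static string[] Trace(string input, string pattern, string replacement)
    {
        // Step 1: wrap every \U$n / \L$n (or ${name}) in a placeholder.
        string enhanced = Regex.Replace(
            replacement,
            @"(?<mode>\\[UL])(?<group>\$((\d+)|({[^}]+})))",
            @"<!<mode:${mode}>%&${group}&%>");

        // Step 2: run the normal replacement; captured text lands inside the placeholders.
        string withPlaceholders = Regex.Replace(input, pattern, enhanced);

        // Steps 3 & 4: resolve the placeholders with a MatchEvaluator lambda.
        string result = Regex.Replace(
            withPlaceholders,
            @"<!<mode:\\L>%&(?<value>[\w\W]*?)&%>",
            m => m.Groups["value"].Value.ToLower());
        result = Regex.Replace(
            result,
            @"<!<mode:\\U>%&(?<value>[\w\W]*?)&%>",
            m => m.Groups["value"].Value.ToUpper());

        return new[] { enhanced, withPlaceholders, result };
    }
}
```

For the input public string test12 { get; set; }, the pattern (public \w+ )(\w) and the replacement $1\U$2, the three returned strings should be $1<!<mode:\U>%&$2&%>, then public string <!<mode:\U>%&t&%>est12 { get; set; }, and finally public string Test12 { get; set; }.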

Regex101 Demo:

Step 1: Enhance pattern with placeholder https://regex101.com/r/ZtqigN/1

Step 2: Use new replacement pattern https://regex101.com/r/PWLTFD/1

Steps 3 & 4: Resolve new placeholders https://regex101.com/r/5DIIUo/1

Answer

var result = EnhancedReplace(input, @"^([a-z]{2})(-)([a-z]{2})$", @"\L$1$2\U$3", RegexOptions.IgnoreCase);

Upvotes: 1

Wiktor Stribiżew

Reputation: 626845

.NET regex does not support case-modifying operators such as \U and \L in replacement patterns.

You may use MatchEvaluator:

var result = Regex.Replace(s, @"(?i)^([a-z]{2})-([a-z]{2})$", m => 
    $"{m.Groups[1].Value.ToLower()}-{m.Groups[2].Value.ToUpper()}");

See the C# demo.

Details

  • (?i) - the inline version of the RegexOptions.IgnoreCase modifier
  • ^ - start of the string
  • ([a-z]{2}) - Capturing group #1: 2 ASCII letters
  • - - a hyphen
  • ([a-z]{2}) - Capturing group #2: 2 ASCII letters
  • $ - end of string.
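Since the question mentions that MatchEvaluator is hard to follow, here is a minimal sketch of the same replacement with the delegate written as a named method instead of a lambda. The names LanguageCodeDemo, FixCase and Normalize are hypothetical. As an assumption on my part, it uses ToLowerInvariant/ToUpperInvariant instead of ToLower/ToUpper so the result does not depend on the current culture (under a Turkish culture, for example, "IN".ToLower() does not yield "in").

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the Regex.Replace call shown above.
public static class LanguageCodeDemo
{
    // A MatchEvaluator is a method that takes a Match and returns a string.
    // Regex.Replace calls it once per match and inserts the returned text.
    private static string FixCase(Match m)
    {
        return m.Groups[1].Value.ToLowerInvariant()
             + "-"
             + m.Groups[2].Value.ToUpperInvariant();
    }

    public static string Normalize(string s)
    {
        // Input that does not match the pattern is returned unchanged.
        return Regex.Replace(
            s,
            @"(?i)^([a-z]{2})-([a-z]{2})$",
            new MatchEvaluator(FixCase));
    }
}
```

With this sketch, LanguageCodeDemo.Normalize("cC-Cc") returns "cc-CC", while a string such as "abc" comes back as-is because the pattern does not match.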

Upvotes: 10
