Icemanind

Reputation: 48696

Converting string to Pascal Case using RegEx

I am trying to use Regex.Replace to convert a string into Pascal case. Regex is not necessary, but I thought it might be easier. Here are some example test cases I'm trying to convert:

simple simon says       => SimpleSimonSays
SIMPLE SIMON SaYs       => SimpleSimonSays
simple_simon_says       => SimpleSimonSays
simple    simon    says => SimpleSimonSays
simpleSimonSays         => SimpleSimonSays
simple___simon___  says => SimpleSimonSays

The method I currently have doesn't use Regex, and it works correctly on all but one of the examples above:

internal static string GetPascalCaseName(string name)
{
    string s = System.Globalization.CultureInfo.CurrentCulture.
               TextInfo.ToTitleCase(name.ToLower()).Replace(" ", "").Replace("_", "");

    return s;
}

The one example that fails is simpleSimonSays. It currently returns Simplesimonsays instead of SimpleSimonSays. How can I make this work on all of the scenarios?

EDIT

So basically, words are distinguished by spaces separating them, by underscores, or whenever an upper-case character is reached. Also, multiple spaces and/or multiple underscores should be treated as one. Spaces and underscores should just be dropped and used as a signal that the next letter should be a capital letter. Like this:

simple_____simon___   says => SimpleSimonSays

Upvotes: 4

Views: 5032

Answers (3)

AndrewF

Reputation: 33

If inputs like "abc simpleSimonSays" are possible, then it's impossible. Or you need to add more rules. Or use things like deep learning :)
EDIT:
Possible code (it does not handle "abc simpleSimonSays"):

var s = "simple__simon_says __ Hi _ _,,, __coolWa";

var s1 = Regex.Replace(s, "[ _,]+", " ");
var s2 = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(s1);
var s3 = s2.Replace(" ","");

// s1 = "simple simon says Hi coolWa"
// s2 = "Simple Simon Says Hi Coolwa"
// s3 = "SimpleSimonSaysHiCoolwa"

Upvotes: 0

Pushpesh Kumar Rajwanshi

Reputation: 18357

I have a trick for solving your problem. When the input contains no space, use a regex to insert a space at each camel-case boundary inside the word (as in simpleSimonSays). Modify your method to this:

internal static string GetPascalCaseName(string name)
{
    if (!name.Contains(" ")) {
        name = Regex.Replace(name, "(?<=[a-z])(?=[A-Z])", " ");
    }
    string s = System.Globalization.CultureInfo.CurrentCulture.
               TextInfo.ToTitleCase(name.ToLower()).Replace(" ", "").Replace("_", "");

    return s;
}

This new line in your method:

name = Regex.Replace(name, "(?<=[a-z])(?=[A-Z])", " ");

splits a camel-case word by inserting a space wherever a lower-case letter is followed by an upper-case one, so it looks like the other inputs where you had no difficulty.

For this input:

simpleSimonSays

It outputs this:

SimpleSimonSays

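And for the rest of the inputs, it works anyway. This strategy also works for words that are partly camel case and partly underscore-separated. Note that, because of the Contains(" ") check, camel-case boundaries are not split when the input already contains a space.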
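To see what the lookaround pattern does on its own, here is a minimal sketch. The class name CamelBoundaryDemo and the helper SplitCamel are made up for illustration and are not part of the method above. It also suggests why the split is only applied when the input has no space: a mixed-case word such as SaYs would be split as well.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
static class CamelBoundaryDemo
{
    // Inserts a space at every position that is preceded by a lower-case
    // ASCII letter and followed by an upper-case ASCII letter.
    // Both assertions are zero-width, so no characters are removed.
    public static string SplitCamel(string name) =>
        Regex.Replace(name, "(?<=[a-z])(?=[A-Z])", " ");

    static void Main()
    {
        Console.WriteLine(SplitCamel("simpleSimonSays")); // simple Simon Says
        Console.WriteLine(SplitCamel("SIMPLE"));          // SIMPLE (no lower-to-upper boundary)
        Console.WriteLine(SplitCamel("SaYs"));            // Sa Ys
    }
}
```

Since the pattern only uses [a-z] and [A-Z], it assumes ASCII letters; accented or non-Latin letters would not create a boundary.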

Upvotes: 1

jdweng

Reputation: 34429

Here is a solution without Regex. The last one (simpleSimonSays) cannot be done.

string[] input = {
    "simple simon says",
    "SIMPLE SIMON SaYs",
    "simple_simon_says",
    "simple    simon    says",
    "simpleSimonSays"
};

// Split on spaces/underscores, then upper-case the first character of each word and lower-case the rest.
var temp = input
    .Select(x => x.Split(new char[] { ' ', '_' }, StringSplitOptions.RemoveEmptyEntries)
        .Select(y => y.Select((z, i) => (i == 0) ? z.ToString().ToUpper() : z.ToString().ToLower())))
    .ToArray();

// Join the characters back into words, and the words into one string per input.
string[] output = temp.Select(x => string.Join("", x.Select(y => string.Join("", y)))).ToArray();

Upvotes: 0
