Reputation: 48696
I am trying to use Regex.Replace to convert a string into Pascal case. Regex is not required, but I thought it might make things easier. Here are some example test cases I'm trying to convert:
simple simon says => SimpleSimonSays
SIMPLE SIMON SaYs => SimpleSimonSays
simple_simon_says => SimpleSimonSays
simple simon says => SimpleSimonSays
simpleSimonSays => SimpleSimonSays
simple___simon___ says => SimpleSimonSays
The method I currently have doesn't use Regex, and it works correctly on all but one of the examples above:
internal static string GetPascalCaseName(string name)
{
    string s = System.Globalization.CultureInfo.CurrentCulture.
        TextInfo.ToTitleCase(name.ToLower()).Replace(" ", "").Replace("_", "");
    return s;
}
The one example that fails is simpleSimonSays. It currently returns Simplesimonsays instead of SimpleSimonSays. How can I make this work for all of these scenarios?
EDIT
So basically, words are delimited by spaces, by underscores, or by an upper-case character. Multiple spaces and/or multiple underscores should be treated as one. In other words, spaces and underscores should be dropped and used only as a signal that the next letter should be a capital letter. Like this:
simple_____simon___ says => SimpleSimonSays
Upvotes: 4
Views: 5032
Reputation: 33
If the input can look like "abc simpleSimonSays", then it's impossible as stated. You would need to add more rules, or use something like deep learning :)
EDIT:
Possible code (it does not handle inputs like "abc simpleSimonSays"):
var s = "simple__simon_says __ Hi _ _,,, __coolWa";
var s1 = Regex.Replace(s, "[ _,]+", " ");
var s2 = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(s1);
var s3 = s2.Replace(" ","");
// s1 = "simple simon says Hi coolWa"
// s2 = "Simple Simon Says Hi Coolwa"
// s3 = "SimpleSimonSaysHiCoolwa"
Upvotes: 0
Reputation: 18357
I have a trick for solving your problem. For names that are camel case (like simpleSimonSays), use a regex to insert a space before each upper-case letter that follows a lower-case letter. Modify your method to this:
internal static string GetPascalCaseName(string name)
{
    if (!name.Contains(" ")) {
        name = Regex.Replace(name, "(?<=[a-z])(?=[A-Z])", " ");
    }
    string s = System.Globalization.CultureInfo.CurrentCulture.
        TextInfo.ToTitleCase(name.ToLower()).Replace(" ", "").Replace("_", "");
    return s;
}
This new line in your method,
name = Regex.Replace(name, "(?<=[a-z])(?=[A-Z])", " ");
splits the camel-case name by inserting a space between its words, so it looks like the inputs you already handle without difficulty.
For this input,
simpleSimonSays
It outputs this,
SimpleSimonSays
The rest of the inputs work as before. This strategy also works for names that mix camel case with underscores. Because of the Contains(" ") check, though, a name that mixes spaces with camel case (for example "abc simpleSimonSays") is not split.
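To see what the lookaround pattern does on its own, here is a minimal sketch. The class and helper names (CamelSplitDemo, SplitCamel) are made up for illustration and are not part of the method above.

```csharp
using System;
using System.Text.RegularExpressions;

static class CamelSplitDemo
{
    // Hypothetical helper: inserts a space at every lower-to-upper boundary.
    internal static string SplitCamel(string text)
    {
        // (?<=[a-z]) requires a lowercase ASCII letter just before the position,
        // (?=[A-Z]) requires an uppercase ASCII letter just after it.
        // The match is zero-width, so no characters are removed.
        return Regex.Replace(text, "(?<=[a-z])(?=[A-Z])", " ");
    }

    static void Main()
    {
        Console.WriteLine(SplitCamel("simpleSimonSays")); // simple Simon Says
        Console.WriteLine(SplitCamel("SaYs"));            // Sa Ys
        Console.WriteLine(SplitCamel("simple_simon"));    // simple_simon (no boundary, unchanged)
    }
}
```

Note that SaYs is split as well, so applying the replacement to SIMPLE SIMON SaYs would produce an extra word. The pattern only looks at ASCII letters; non-ASCII names would need \p{Ll} and \p{Lu} instead.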
Upvotes: 1
Reputation: 34429
Here is a solution without Regex. The last one (simpleSimonSays) cannot be handled this way.
string[] input = {
"simple simon says",
"SIMPLE SIMON SaYs",
"simple_simon_says",
"simple simon says",
"simpleSimonSays"
};
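// Split each string into words, then upper-case the first character of each word and lower-case the rest.
var temp = input.Select(x => x
    .Split(new char[] {' ', '_'}, StringSplitOptions.RemoveEmptyEntries)
    .Select(y => y.Select((z, i) => (i == 0) ? z.ToString().ToUpper() : z.ToString().ToLower())))
    .ToArray();
// Join the characters of each word, then join the words of each string.
string[] output = temp.Select(x => string.Join("", x.Select(y => string.Join("", y)))).ToArray();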
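The same split-and-capitalize idea can be written for a single string, which may be easier to read than the nested Select calls. This is only a sketch: the names NoRegexPascalCase and ToPascalCase are hypothetical, and it uses the invariant-culture casing methods (an assumption) rather than ToUpper()/ToLower().

```csharp
using System;
using System.Linq;

static class NoRegexPascalCase
{
    // Hypothetical helper that mirrors the LINQ pipeline above for one string.
    internal static string ToPascalCase(string text)
    {
        // Split on spaces and underscores; repeated separators yield empty
        // entries, which RemoveEmptyEntries drops.
        string[] words = text.Split(new[] { ' ', '_' }, StringSplitOptions.RemoveEmptyEntries);

        // Upper-case the first character of each word and lower-case the rest.
        return string.Concat(words.Select(w =>
            char.ToUpperInvariant(w[0]) + w.Substring(1).ToLowerInvariant()));
    }

    static void Main()
    {
        string[] samples = { "simple simon says", "SIMPLE SIMON SaYs", "simple___simon___ says", "simpleSimonSays" };
        foreach (string s in samples)
        {
            Console.WriteLine(s + " => " + ToPascalCase(s));
        }
    }
}
```

The first three samples come out as SimpleSimonSays. The last one comes out as Simplesimonsays, because a word with no separators is treated as a single word.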
Upvotes: 0