Guy Segal

Reputation: 981

Struggling with regex code: CamelCase to camel_case

I was able to transform the string MyClassName to my_class_name using a regex.

However, my solution did not work with MyOtherTClassName, which should transform to my_other_t_class_name.

It also didn't work on ClassNumber1, which should be transformed into class_number_1.

My own solution was not good enough, so without going into it, I would like help with a regex that transforms:

  1. MyClassName -> my_class_name
  2. MyOtherTClassName -> my_other_t_class_name
  3. MyClassWith1Number -> my_class_with_1_number

Thanks,

Guy

Upvotes: 0

Views: 702

Answers (2)

Will

Reputation: 534

I recently had this problem. The previous answer works if there is only one digit, but if there are two or more consecutive digits, each of them gets its own preceding underscore. I used this to do the conversion in PHP:

strtolower(preg_replace('/(?<!^)([A-Z])|(?<![0-9])([0-9])/', '_$1$2', $string))

I believe the regex should behave the same in C#, so I'll break it down.

(?<!       # negative look behind
   ^       # beginning of string
)
([A-Z])    # one of capital letters
|          # or
(?<!       # negative look behind
   [0-9]   # one of digits
)
([0-9])    # one of digits

The idea for letters is the same: make sure the match is not at the beginning of the string. For digits, just make sure that the previous character is not also a digit. We don't have to worry about a digit at the beginning of the string, because the string (a class name) won't start with a digit.
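As a rough sketch of how the same pattern could be used from C#, assuming .NET's System.Text.RegularExpressions (which supports this lookbehind syntax and substitutes an empty string for a group that did not take part in the match). The class name SnakeCaseDemo and the helper ToSnakeCase are made up for this illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SnakeCaseDemo
{
    // Same pattern as the PHP call above: an uppercase letter that is not at
    // the start of the string, or a digit that is not preceded by a digit.
    private static readonly Regex Boundary =
        new Regex(@"(?<!^)([A-Z])|(?<![0-9])([0-9])");

    // Hypothetical helper name, not part of any library.
    public static string ToSnakeCase(string input)
    {
        // Only one of the two groups participates in each match, so "$1$2"
        // expands to the single matched character.
        return Boundary.Replace(input, "_$1$2").ToLowerInvariant();
    }

    public static void Main()
    {
        Console.WriteLine(ToSnakeCase("MyClassName"));          // my_class_name
        Console.WriteLine(ToSnakeCase("MyOtherTClassName"));    // my_other_t_class_name
        Console.WriteLine(ToSnakeCase("MyClassWith1Number"));   // my_class_with_1_number
        Console.WriteLine(ToSnakeCase("MyClassWith12Numbers")); // my_class_with_12_numbers
    }
}
```

Note that an input that does start with a digit, such as 1Class, would come out as _1_class, so this relies on the assumption above that class names never start with a digit.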

Upvotes: 0

KeyNone

Reputation: 9150

The logic is that you want to convert every capital letter to its lower-case variant and precede it (and every digit) with an underscore.
For example, T becomes _t and 6 becomes _6.
The only exception is the very first character: you don't want to precede it with an underscore. The regex handles this case with a negative lookbehind so that it does not match the first character.

//requires: using System; using System.Text.RegularExpressions;

//your input
string input = "MyOtherTClass1Name";

//the regex
string result = Regex.Replace(
    input, 
    "((?<!^)[A-Z0-9])", //the regex, see below for explanation
    delegate(Match m) { return "_" + m.ToString().ToLower(); }, //replace function
    RegexOptions.None
);
result = result.ToLower(); //one more time ToLower(); for the first character of the input

Console.WriteLine(result);

For the regex itself:

(           #start of capturing group
  (?<!      #negative lookbehind
     ^      #beginning of the string
  )         #end of lookbehind
  [A-Z0-9]  #one of A-Z or 0-9
)           #end of capturing group

So we capture every capital letter and every digit (except for the very first character) and replace each one with its lower-case variant, preceded by an underscore.
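To see what "every digit" means in practice, here is a small sketch that runs the same regex over the inputs from the question plus one input with two consecutive digits. The class name SingleClassDemo, the method name Convert and the extra input ClassNumber12 are made up for this illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SingleClassDemo
{
    // Hypothetical wrapper around the Regex.Replace call shown above.
    public static string Convert(string input)
    {
        // Prefix every capital letter or digit (except at index 0) with "_",
        // then lower-case the whole result in one pass.
        return Regex.Replace(input, "(?<!^)[A-Z0-9]", m => "_" + m.Value)
                    .ToLowerInvariant();
    }

    public static void Main()
    {
        Console.WriteLine(Convert("MyClassName"));        // my_class_name
        Console.WriteLine(Convert("MyOtherTClassName"));  // my_other_t_class_name
        Console.WriteLine(Convert("MyClassWith1Number")); // my_class_with_1_number
        Console.WriteLine(Convert("ClassNumber12"));      // class_number_1_2
    }
}
```

Because each digit is matched on its own, a run of digits such as 12 is split into _1_2. That is fine for the three inputs in the question, but worth knowing if multi-digit numbers can occur.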

Upvotes: 6
