Jess

Reputation: 227

Regular Expression to split uppercase, uppercase, lowercase pattern in Groovy

I am trying to split a camelcase string into individual words while keeping strings of capital letters together. For example, "fooBarABABFooBar" should become "foo bar ABAB foo bar". There are a few requirements. Abbreviations like "ABAB" should remain capitalized, but the first letter of the other words should be lowercase. I've had some luck breaking apart the camel case using the following regular expression:

def str = "fooBarABABFooBar"
println str.replaceAll(/(?<=[a-z])(?=[A-Z])/) { ' ' + it }

This gets me "foo Bar ABABFoo Bar". I've been able to go from this to "foo Bar A B A B Foo Bar", but not to the desired output. Any ideas? Thanks!

Upvotes: 1

Views: 1838

Answers (1)

Ibrahim Najjar

Reputation: 19423

Try the following expression:

(?=[A-Z][a-z])|(?<=[a-z])(?=[A-Z])
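As a quick check, the expression can be dropped into the same kind of replaceAll call used in the question. This is only a sketch: the variable name spaced is made up for illustration, and the replacement is a plain space because both alternatives are zero-width.

```groovy
def str = "fooBarABABFooBar"

// Both alternatives are zero-width (lookarounds only), so every match is an
// empty string; replacing it with ' ' inserts a space at that position.
def spaced = str.replaceAll(/(?=[A-Z][a-z])|(?<=[a-z])(?=[A-Z])/, ' ')

println spaced   // foo Bar ABAB Foo Bar
```

Note that a string starting with a capitalized word (for example "FooBar") would also match at position 0 and gain a leading space, so such input may need a trim() afterwards.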

As you can see, I have used your original idea and modified it a bit. Your expression (?<=[a-z])(?=[A-Z]) matches positions where a lowercase letter is immediately followed by an uppercase letter, which is a good start.

I took this idea further and noticed that there are other important positions too, namely (?=[A-Z][a-z]). In other words: match positions immediately before an uppercase letter that is followed by a lowercase letter, such as the start of Foo, because that is most likely the beginning of a new camel-case word.

This added alternative matches the following positions:

foo BarABAB Foo Bar
   ^       ^   ^
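To see these positions in isolation, the added alternative can be run on its own. This is a small illustrative sketch, and firstOnly is a hypothetical variable name:

```groovy
def str = "fooBarABABFooBar"

// Only the new alternative: a position directly before an uppercase letter
// that is followed by a lowercase letter.
def firstOnly = str.replaceAll(/(?=[A-Z][a-z])/, ' ')

println firstOnly   // foo BarABAB Foo Bar
```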

Notice there is one position left, between Bar and ABAB. This is where your expression comes in: besides two of the positions above, it also matches here:

foo Bar ABAB Foo Bar
       ^

So where my addition fails, yours succeeds, and vice versa. Together, the two alternatives match:

foo Bar ABAB Foo Bar
   ^   ^    ^   ^
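The expression only finds the split points; the question also asks for the non-abbreviation words to be lowercased. One possible way to finish the job is sketched below. The helper name splitCamel is hypothetical, and the rule that any word made up entirely of capitals is an abbreviation is an assumption, not something stated in the question.

```groovy
// Hypothetical helper: split at the matched positions, then lowercase
// every word that is not made up entirely of capital letters.
def splitCamel(String s) {
    def words = s.split(/(?=[A-Z][a-z])|(?<=[a-z])(?=[A-Z])/)
    // Assumption: an all-caps word such as "ABAB" is an abbreviation and stays as-is.
    def fixed = words.collect { w -> (w ==~ /[A-Z]+/) ? w : w.toLowerCase() }
    return fixed.join(' ')
}

println splitCamel("fooBarABABFooBar")   // foo bar ABAB foo bar
```

Under this assumption a single capital that ends up as its own word (the "A" in "thisIsATest") is treated as an abbreviation too; changing the check to /[A-Z]{2,}/ would lowercase it instead.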

Upvotes: 1
