Reputation: 53
I'm looking for a regex solution for the following scenario:
I have strings that I need to split at each uppercase letter, but runs of consecutive uppercase letters should not be split.
For example, if the input is
DisclosureOfComparativeInformation
the output should be
Disclosure Of Comparative Information
But consecutive uppercase letters should stay together:
GAAP
should not become G A A P.
How can I find this pattern and insert a space?
Thanks
Upvotes: 5
Views: 3046
Reputation: 811
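In Perl this should work:
$str =~ s/([A-Z][a-z])/ $1/g;
The parentheses around the two character classes capture the match so it can be reused as "$1" in the replacement. If the string starts with a capitalized word, the result also starts with a space, so trim it afterwards.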
Upvotes: 0
Reputation: 17367
Split and Join:
string.Join(" ", Regex.Split("DisclosureOfComparativeInformation", @"(?<=[a-z])(?=[A-Z])"))
Upvotes: 0
Reputation: 5921
((?<=[a-z])[A-Z]|[A-Z](?=[a-z]))
replace with
" $1"
As a second step, trim the result, because a capital at the start of the string also gets a space in front of it.
See also this related question:
Regular expression, split string by capital letter but ignore TLA
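To make the replace-then-trim steps concrete, here is a minimal sketch of the same pattern in Python (a port for illustration, not the C#/.NET code the thread otherwise uses). The function name split_words is made up for this example.

```python
import re

# Same two-branch pattern as above: an uppercase letter preceded by a
# lowercase letter, or an uppercase letter followed by a lowercase letter.
BOUNDARY = re.compile(r"((?<=[a-z])[A-Z]|[A-Z](?=[a-z]))")

def split_words(text: str) -> str:
    """Hypothetical helper: put a space before each boundary capital."""
    # Step 1: prefix every matched capital with a space.
    # Step 2: strip the space that ends up before a leading capital.
    return BOUNDARY.sub(r" \1", text).strip()

print(split_words("DisclosureOfComparativeInformation"))
print(split_words("GAAP"))
print(split_words("GAAPReport"))
```

Under these assumptions, an all-caps run such as GAAP is left alone, and a run followed by a capitalized word is split before that word.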
Upvotes: 1
Reputation: 24236
Try:
var subjectString = "DisclosureOfComparativeInformation";
var resultString = Regex.Replace(subjectString, "([a-z])([A-Z])", "$1 $2");
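For comparison, the same two-group replacement can be sketched in Python (a port for illustration; split_camel is a made-up name). It inserts a space only where a lowercase letter is directly followed by an uppercase one.

```python
import re

def split_camel(text: str) -> str:
    """Hypothetical helper mirroring the ([a-z])([A-Z]) replacement above."""
    # Group 1 is the lowercase letter, group 2 the uppercase letter after it;
    # both are written back with a space between them.
    return re.sub(r"([a-z])([A-Z])", r"\1 \2", text)

print(split_camel("DisclosureOfComparativeInformation"))
print(split_camel("GAAP"))
print(split_camel("GAAPReport"))
```

One consequence of this design: because a lowercase letter is required on the left, an uppercase run followed by a capitalized word (GAAPReport) is not split at all.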
Upvotes: 9
Reputation: 3049
[A-Z][a-z]+
If each match is replaced with the match followed by a space (and the trailing space is trimmed), the results are:
DisclosureOfComparativeInformation -> Disclosure Of Comparative Information
GAPS -> GAPS
SOmething -> SOmething (this one may be undesirable)
alllower -> alllower
Upvotes: 1
Reputation: 36339
Regex solutions that look for strings where something is not true tend to become unreadable. I'd recommend looping over the string and splitting it accordingly, without using a regex.
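A loop-based version might look like the following Python sketch. The function name and the exact boundary rule (a capital starts a new word if the previous character is lowercase or the next one is) are assumptions chosen to match the examples in the question, not something spelled out in this answer.

```python
def split_on_case_change(text: str) -> str:
    """Hypothetical helper: insert spaces at word boundaries without regex."""
    parts = []
    for i, ch in enumerate(text):
        if i > 0 and ch.isupper():
            prev_lower = text[i - 1].islower()
            next_lower = i + 1 < len(text) and text[i + 1].islower()
            # A capital begins a new word when it follows a lowercase letter
            # or is itself followed by one (the end of an all-caps run).
            if prev_lower or next_lower:
                parts.append(" ")
        parts.append(ch)
    return "".join(parts)

print(split_on_case_change("DisclosureOfComparativeInformation"))
print(split_on_case_change("GAAP"))
print(split_on_case_change("GAAPReport"))
```

Skipping index 0 avoids a leading space, so no trim step is needed.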
Upvotes: 0
Reputation: 19080
Try this regex:
[a-z](?=[A-Z])
With this call to replace:
regex.Replace(toMatch, "$& ")
For more information on the special replacement symbol "$&", see http://msdn.microsoft.com/en-us/library/ewy2t5e0.aspx#EntireMatch
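As a rough Python equivalent (a port for illustration; insert_spaces is a made-up name), the whole-match reference is written \g<0> instead of .NET's $&. The lookahead keeps the uppercase letter out of the match, so only the lowercase letter is written back, followed by a space.

```python
import re

def insert_spaces(text: str) -> str:
    """Hypothetical helper: add a space after a lowercase letter that precedes a capital."""
    # \g<0> is Python's reference to the entire match (here: one lowercase letter).
    return re.sub(r"[a-z](?=[A-Z])", r"\g<0> ", text)

print(insert_spaces("DisclosureOfComparativeInformation"))
print(insert_spaces("GAAP"))
```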
Upvotes: 1