Deepa

Reputation: 53

Looking for a regex to split a string on uppercase letters

I'm looking for a regex solution for the following scenario:

I have strings that I need to split at each uppercase letter, but runs of consecutive uppercase letters should not be split.

For example, if the input is

DisclosureOfComparativeInformation

the output should be

Disclosure Of Comparative Information

Consecutive uppercase letters should stay together, though:

GAAP should not result in G A A P.

How can I match this pattern and insert the spaces?

Thanks

Upvotes: 5

Views: 3046

Answers (7)

jcadcell

Reputation: 811

In Perl this should work:

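$str =~ s/([A-Z][a-z])/ $1/g;

The parentheses around the two character classes capture the match so that "$1" can reuse it in the replacement. Because the first letter of the string also matches, the result starts with a space that you will need to trim.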

Upvotes: 0

onof

Reputation: 17367

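Split and Join. Regex.Split returns empty strings between adjacent matches, so filter them out before joining (this needs using System.Linq):

string.Join(" ", Regex.Split("DisclosureOfComparativeInformation", @"([A-Z][a-z]*)").Where(s => s.Length > 0))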

Upvotes: 0

sikender

Reputation: 5921

((?<=[a-z])[A-Z]|[A-Z](?=[a-z]))

replace with

" $1"

In a second step you'd have to trim the string.

See also this related question:

Regular expression, split string by capital letter but ignore TLA
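
As a rough illustration of the two steps above (replace, then trim), here is a minimal Python sketch. The pattern is the one from this answer; the function name add_spaces and the use of Python's re module are my own assumptions, not part of the original answer.

```python
import re

# Pattern from the answer: an uppercase letter preceded by a lowercase one,
# or an uppercase letter followed by a lowercase one.
PATTERN = re.compile(r"((?<=[a-z])[A-Z]|[A-Z](?=[a-z]))")

def add_spaces(text: str) -> str:
    """Hypothetical helper: insert a space before each match, then trim."""
    spaced = PATTERN.sub(r" \1", text)  # step 1: prefix every match with a space
    return spaced.strip()               # step 2: drop the leading space

if __name__ == "__main__":
    for sample in ("DisclosureOfComparativeInformation", "GAAP", "GAAPReport"):
        print(repr(add_spaces(sample)))
```

The trim is needed because the first letter of a string such as "DisclosureOf..." also matches the second alternative.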

Upvotes: 1

ipr101

Reputation: 24236

Try:

var subjectString = "DisclosureOfComparativeInformation";
var resultString = Regex.Replace(subjectString, "([a-z])([A-Z])", "$1 $2");

Upvotes: 9

smitec

Reputation: 3049

[A-Z]{1}[a-z]+

will split as follows if each match is replaced with the match plus a space:

DisclosureOfComparativeInformation -> Disclosure Of Comparative Information

GAPS -> GAPS

SOmething -> SOmething (this one may be undesirable)

alllower -> alllower

Upvotes: 1

Ingo

Reputation: 36339

Regex solutions that have to look for places where something is not true tend to become unreadable. I'd recommend going through the string in a loop and splitting it accordingly, without using a regex.
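
One way such a loop might look, sketched in Python. The function name split_on_case is hypothetical, and the rule for an acronym followed by a capitalised word (breaking "GAAPReport" into "GAAP Report") is my assumption; the question does not say how that case should be handled.

```python
def split_on_case(text: str) -> str:
    """Hypothetical helper: insert spaces at word boundaries without a regex."""
    pieces = []
    for i, ch in enumerate(text):
        if i > 0 and ch.isupper():
            prev = text[i - 1]
            next_is_lower = i + 1 < len(text) and text[i + 1].islower()
            # Break after a lowercase letter ("eO" in "DisclosureOf"), or
            # before the last capital of a run when a lowercase letter
            # follows it (assumed rule, e.g. "PR" in "GAAPReport").
            if prev.islower() or (prev.isupper() and next_is_lower):
                pieces.append(" ")
        pieces.append(ch)
    return "".join(pieces)

if __name__ == "__main__":
    for sample in ("DisclosureOfComparativeInformation", "GAAP", "GAAPReport"):
        print(repr(split_on_case(sample)))
```

Each condition is spelled out in plain code, so adjusting the acronym rule means changing one branch rather than reworking a pattern.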

Upvotes: 0

WiseGuyEh

Reputation: 19080

Try this regex:

[a-z](?=[A-Z])

With this call to replace:

regex.Replace(toMatch, "$& ")

For more information on the special replacement symbol "$&", see http://msdn.microsoft.com/en-us/library/ewy2t5e0.aspx#EntireMatch
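
For comparison, a small Python sketch of the same idea. Python's re module has no "$&"; the whole match is written as \g<0> in the replacement. The function name space_after_lower is hypothetical.

```python
import re

def space_after_lower(text: str) -> str:
    """Hypothetical helper: add a space after a lowercase letter that is
    immediately followed by an uppercase letter."""
    # The lookahead (?=[A-Z]) checks the next character without consuming it,
    # so only the lowercase letter is replaced by itself plus a space.
    return re.sub(r"[a-z](?=[A-Z])", r"\g<0> ", text)

if __name__ == "__main__":
    for sample in ("DisclosureOfComparativeInformation", "GAAP", "GAAPReport"):
        print(repr(space_after_lower(sample)))
```

With this pattern an acronym followed directly by a capitalised word, such as "GAAPReport", is left unsplit, because no lowercase letter precedes the "R".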

Upvotes: 1
