Stefano
Stefano

Reputation: 179

Formula to split words by uppercase

I would like to come up with a script to program a custom formula for google sheet. The idea is to split a string composed of multiple words. The formula should recognize the words where there is a capital letter and separate them. The result would be a string where the words are separated by ",".

To clarify this is an example of the string:

Nursing StudentStudentNurseNursing School

Desired Result:
Nursing Student,Student,Nurse,Nursing School

I have tried to use a formula in Google Sheet:

=split(regexreplace(A1,"[A-Z][^A-Z]*","$0"&char(9)),char(9))

However, it generates 6 cells with the below strings:

Nursing Student Student Nurse Nursing School

Can anybody help me or give me some hint?

Upvotes: 3

Views: 8407

Answers (2)

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626861

If you plan to insert a comma in between a lowercase letter and an uppercase letter, you may use either of:

=REGEXREPLACE(A1,"([a-z])([A-Z])","$1,$2")
=REGEXREPLACE(A1,"([[:lower:]])([[:upper:]])","$1,$2")

where

  • ([a-z]) / ([[:lower:]]) - Capturing group 1 (later referred to with $1 from the replacement pattern): any lowercase ASCII letter
  • ([A-Z]) / ([[:upper:]]) - Capturing group 2 (later referred to with $2 from the replacement pattern): any uppercase ASCII letter

enter image description here

Note that another suggestion, based on a non-word boundary \B, that can be written as =REGEXREPLACE(A1,"\B[A-Z]",",$0"), will also match an uppercase letter after _ and any digit, so it might overfire if you do not expect that behavior.

Upvotes: 3

TheMaster
TheMaster

Reputation: 50462

=REGEXREPLACE(A1,"(\B)([A-Z])",",$2")
  • \B not a word Border.
  • [A-Z] Upper case letter.
  • If \B is followed by a upper case letter, replace the \B with ,

Upvotes: 7

Related Questions