user3764639

Reputation:

Split strings using a pattern

Initial vector is

"WhiteRiver"   "microProcess" "PartsUnknown" "RedSox"

How do I split this into

"White"   "River"   "micro"   "Process" "Parts"   "Unknown" "Red"     "Sox"

The rule is to split wherever a lower-case letter is immediately followed by an upper-case letter.

Upvotes: 2

Views: 81

Answers (2)

Jota

Reputation: 17611

Using v1 like @akrun:

unlist(strsplit(sub("([a-z])([A-Z])", "\\1,\\2", v1), ","))
#[1] "White"   "River"   "micro"   "Process" "Parts"   "Unknown" "Red"     "Sox"

sub replaces only the first match in each string, so use gsub instead of sub if there are more than two words to split (e.g. WhiteRiverPark).
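
A minimal sketch of the difference between the two, assuming a hypothetical vector v2 with a three-word element (not part of the question) and assuming the inputs contain no commas, since the comma serves as the temporary delimiter:

```r
# v2 is a hypothetical example vector, not taken from the question
v2 <- c("WhiteRiverPark", "RedSox")

# sub() marks only the first lower/upper boundary in each string
first_only <- unlist(strsplit(sub("([a-z])([A-Z])", "\\1,\\2", v2), ","))
print(first_only)
#[1] "White"     "RiverPark" "Red"       "Sox"

# gsub() marks every lower/upper boundary
all_splits <- unlist(strsplit(gsub("([a-z])([A-Z])", "\\1,\\2", v2), ","))
print(all_splits)
#[1] "White" "River" "Park"  "Red"   "Sox"
```

If the data may contain commas, a delimiter that cannot occur in the input would be a safer choice.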

Upvotes: 0

akrun

Reputation: 886948

If v1 is the initial vector:

unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl=TRUE))
#[1] "White"   "River"   "micro"   "Process" "Parts"   "Unknown" "Red"    
#[8] "Sox"    

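(?<=[a-z]) is a lookbehind: the split point must be immediately preceded by a lower-case letter.

(?=[A-Z]) is a lookahead: the split point must be immediately followed by an upper-case letter.

Both are zero-width assertions, so no characters are consumed and the letters on either side of the split are kept.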
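
For completeness, a self-contained sketch that defines v1 from the question's initial vector and also keeps the pieces grouped per input string before flattening. The names pattern, parts, and flat are only illustrative:

```r
# v1 as given in the question
v1 <- c("WhiteRiver", "microProcess", "PartsUnknown", "RedSox")

# Zero-width boundary: lower-case letter behind, upper-case letter ahead
pattern <- "(?<=[a-z])(?=[A-Z])"

# strsplit() returns a list with one character vector per input string;
# perl = TRUE is needed because lookarounds are PCRE syntax
parts <- strsplit(v1, pattern, perl = TRUE)
names(parts) <- v1

# Flatten to a single character vector, dropping the names
flat <- unlist(parts, use.names = FALSE)
print(flat)
```

Keeping the list form (parts) is useful when each piece has to be traced back to the string it came from; unlist discards that grouping.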

Upvotes: 5
