Reputation:
Initial vector is
"WhiteRiver" "microProcess" "PartsUnknown" "RedSox"
How do I split this into
"White" "River" "micro" "Process" "Parts" "Unknown" "Red" "Sox"
The rule is to split wherever a lowercase letter is followed by an uppercase letter.
Upvotes: 2
Views: 81
Reputation: 17611
Using v1, as in @akrun's answer:
unlist(strsplit(sub("([a-z])([A-Z])", "\\1,\\2", v1), ","))
#[1] "White" "River" "micro" "Process" "Parts" "Unknown" "Red" "Sox"
Use gsub instead of sub if there are more than two words to split (e.g. WhiteRiverPark).
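A minimal sketch of the gsub variant follows. The vector v2 is made up for illustration and is not from the question. The comma separator assumes that no element already contains a comma.

```r
# v2 is a hypothetical vector; its first element has two split points
v2 <- c("WhiteRiverPark", "microProcess")

# gsub inserts a comma at every lowercase/uppercase boundary,
# whereas sub would only handle the first one in each element
res_gsub <- unlist(strsplit(gsub("([a-z])([A-Z])", "\\1,\\2", v2), ","))
res_gsub
#[1] "White"   "River"   "Park"    "micro"   "Process"
```

With sub in place of gsub, the first element would come back as "White" and "RiverPark".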
Upvotes: 0
Reputation: 886948
If v1 is the vector:
unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl=TRUE))
#[1] "White" "River" "micro" "Process" "Parts" "Unknown" "Red"
#[8] "Sox"
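(?<=[a-z]) is a lookbehind: the position must be preceded by a lowercase letter
(?=[A-Z]) is a lookahead: the position must be followed by an uppercase letter
Together they match the zero-width position between the two letters, so no characters are consumed by the split.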
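As a small sketch of how the lookaround pattern behaves with more than one boundary per element, the example below uses a made-up vector v3 (not from the question). Because the match is zero-width, no separator character has to be inserted first, so elements that contain commas should also be safe.

```r
# v3 is a hypothetical vector used only for illustration
v3 <- c("WhiteRiverPark", "Red,Sox")

# Split at every position preceded by a lowercase letter
# and followed by an uppercase letter
res_look <- unlist(strsplit(v3, "(?<=[a-z])(?=[A-Z])", perl = TRUE))
res_look
#[1] "White"   "River"   "Park"    "Red,Sox"
```

The second element is left alone because a comma, not a lowercase letter, precedes the "S".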
Upvotes: 5