Reputation: 5491
I am trying to extract, from each word in a string, the first uppercase letter together with the lowercase letter that follows it.
string <- "Programmation _ Is 2 Cool"
gsub("[^A-Z]", "", string)
gsub("[^A-Za-z]", "", string)
The two results are:
"PIC"
"ProgrammationIsCool"
I would like to get:
"PrIsCo"
Thanks for your help.
Upvotes: 2
Views: 1913
Reputation: 627100
If you need to extract the uppercase letter at the start of each word together with the lowercase letter right after it, use
(\\b[A-Z][a-z])|.
or
(\\b\\p{Lu}\\p{Ll})|.
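The idea is to match and capture the first uppercase letter and the lowercase letter that follows it, and to match (and thus remove) every other character. Replacing each match with \\1 keeps only the captured pairs.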
gsub("(\\b[A-Z][a-z])|.", "\\1", string, perl=TRUE)
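As a quick check, the call above can be run on the string from the question. The second and third input strings below are made up for illustration; they show that a word contributes nothing unless it starts with an uppercase letter immediately followed by a lowercase one.

```r
string <- "Programmation _ Is 2 Cool"

# Keep Group 1 (word-initial uppercase letter + following lowercase letter);
# every other character is matched by "." and replaced with an empty Group 1.
result <- gsub("(\\b[A-Z][a-z])|.", "\\1", string, perl = TRUE)
print(result)
# [1] "PrIsCo"

# Hypothetical inputs: "hello" and "x" do not start with an uppercase letter,
# and "USA" has no lowercase letter after its first uppercase letter.
mixed <- gsub("(\\b[A-Z][a-z])|.", "\\1", "hello World x Yz", perl = TRUE)
acronym <- gsub("(\\b[A-Z][a-z])|.", "\\1", "USA Today", perl = TRUE)
print(mixed)
# [1] "WoYz"
print(acronym)
# [1] "To"
```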
Note that with perl=TRUE the dot does not match a newline, so to remove newlines as well you need to prepend (?s) to the pattern.
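A small sketch of the difference, using a made-up two-line variant of the input (the variable names are only for illustration):

```r
multiline <- "Programmation _ Is\n2 Cool"

# Without (?s), "." does not match the newline, so it survives in the output.
without_s <- gsub("(\\b[A-Z][a-z])|.", "\\1", multiline, perl = TRUE)

# With (?s), "." also matches the newline, so it is removed like any other character.
with_s <- gsub("(?s)(\\b[A-Z][a-z])|.", "\\1", multiline, perl = TRUE)

print(without_s)
# [1] "PrIs\nCo"
print(with_s)
# [1] "PrIsCo"
```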
Pattern details:
(\\b[A-Z][a-z]) - Group 1, matching:
    \\b - a word boundary
    [A-Z][a-z] - an uppercase ASCII letter followed by a lowercase ASCII letter (replace with \\p{Lu}\\p{Ll} to match any Unicode uppercase-lowercase letters)
| - or
. - any character but a newline
Upvotes: 4