Reputation: 549
What is the best way to extract the initials of every word in a string except the last one? For example, convert "GEORGE SMITH BROGAN" to "GS BROGAN".
NAMES <- data.frame(ID = c("GEORGE SMITH BROGAN","ADAM STEVE WILLIS","UNITED INTERNATIONAL SHIPPING STATION"))
The desired output for the above names would be GS BROGAN, AS WILLIS, UIS STATION.
Upvotes: 0
Views: 542
Reputation: 18691
Here is a different method using gsub:
gsub('\\s(?![A-Z]+$)', '',
gsub('(?<!\\s|^)[A-Z]+\\s', ' ', NAMES$ID,
perl = TRUE), perl = TRUE)
# [1] "GS BROGAN" "AS WILLIS" "UIS STATION"
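To make the two passes easier to follow, here is a sketch that runs them one at a time and keeps the intermediate result. The variable names ids, step1 and step2 are only illustrative, and the patterns assume the names contain nothing but uppercase ASCII letters and single spaces, as in the question.

```r
# Same sample names as in the question, as a plain character vector
ids <- c("GEORGE SMITH BROGAN", "ADAM STEVE WILLIS",
         "UNITED INTERNATIONAL SHIPPING STATION")

# Pass 1: for every word that is followed by whitespace, drop everything
# after its first letter. The lookbehind (?<!\\s|^) stops a match from
# starting on the first letter of a word.
step1 <- gsub("(?<!\\s|^)[A-Z]+\\s", " ", ids, perl = TRUE)
step1
# [1] "G S BROGAN"    "A S WILLIS"    "U I S STATION"

# Pass 2: delete every space except the one directly before the last word.
# The lookahead (?![A-Z]+$) protects that final space.
step2 <- gsub("\\s(?![A-Z]+$)", "", step1, perl = TRUE)
step2
# [1] "GS BROGAN"   "AS WILLIS"   "UIS STATION"
```

Both patterns need perl = TRUE because lookarounds are not supported by R's default regex engine.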
Upvotes: 0
Reputation: 887621
We can try with gsub:
gsub("\\s+(?=[A-Z]\\b)", "",
gsub("\\b([A-Z])\\w+\\s|\\s(\\w+)$", "\\1 \\2", NAMES$ID), perl = TRUE)
#[1] "GS BROGAN" "AS WILLIS" "UIS STATION"
Or use strsplit with paste:
sapply(strsplit(as.character(NAMES$ID), "\\s+"),
function(x) paste(paste(substr(x[-length(x)], 1, 1), collapse=""),
x[length(x)]))
#[1] "GS BROGAN" "AS WILLIS" "UIS STATION"
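If this is needed in more than one place, the strsplit approach can be wrapped in a small function. The sketch below is one possible way to do that; the name initials_except_last is hypothetical, and the single-word branch is an added assumption about the desired behaviour (without it, a one-word name would come back with a leading space).

```r
# Hypothetical helper (not from the answer above): abbreviate every word
# except the last one to its first letter.
initials_except_last <- function(x) {
  # Trim outer whitespace, then split each name on runs of whitespace
  parts <- strsplit(trimws(as.character(x)), "\\s+")
  vapply(parts, function(w) {
    n <- length(w)
    # Assumption: a single word (or an empty string) is returned unchanged
    if (n <= 1) return(paste(w, collapse = ""))
    # Initials of all words but the last, then a space, then the last word
    paste(paste(substr(w[-n], 1, 1), collapse = ""), w[n])
  }, character(1))
}

initials_except_last(c("GEORGE SMITH BROGAN", "ADAM STEVE WILLIS",
                       "UNITED INTERNATIONAL SHIPPING STATION", "BROGAN"))
# [1] "GS BROGAN"   "AS WILLIS"   "UIS STATION" "BROGAN"
```

Unlike the regex versions, this does not depend on the letters being uppercase, since it only looks at word positions.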
Upvotes: 2