Reputation: 7755
I would like to capitalize everything in a character vector that comes after the first _. For example, the following vector:
x <- c("NYC_23df", "BOS_3_rb", "mgh_3_3_f")
Should come out like this:
"NYC_23DF" "BOS_3_RB" "mgh_3_3_F"
I have been trying to play with regular expressions, but am not able to do this. Any suggestions would be appreciated.
Upvotes: 17
Views: 4910
Reputation: 270348
gsubfn in the gsubfn package is like gsub except that the replacement can be a function. Here we match _ and everything after it, and feed the match through toupper:
library(gsubfn)
gsubfn("_.*", toupper, x)
## [1] "NYC_23DF" "BOS_3_RB" "mgh_3_3_F"
Note that this approach involves a particularly simple regular expression.
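For comparison, the same "apply a function to the match" idea can be sketched in base R with regexpr and regmatches. This is only a rough equivalent, not how gsubfn works internally, and the variable names m and result are just illustrative:

```r
x <- c("NYC_23df", "BOS_3_rb", "mgh_3_3_f")

# Locate the first match of "_.*" in each string (-1 where there is none).
m <- regexpr("_.*", x)

# Overwrite each matched substring with its uppercase version;
# strings without a match are left untouched.
result <- x
regmatches(result, m) <- toupper(regmatches(result, m))
result
```

Because regexpr only reports the first match per string, and _.* runs to the end of the string, everything from the first underscore onward is uppercased.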
Upvotes: 14
Reputation: 226951
You were very close:
gsub("(_.*)","\\U\\1",x,perl=TRUE)
seems to work. You just needed to use _.* (underscore followed by zero or more other characters) rather than _* (zero or more underscores) ...
To take this apart a bit more:
_.* is a regular expression pattern that matches an underscore _ followed by any number (including 0) of additional characters; . denotes "any character" and * denotes "zero or more repeats of the previous element".
() denotes that it is a pattern we want to store.
\\1 in the replacement string says "insert the contents of the first matched pattern", i.e. whatever matched _.*
\\U, in conjunction with perl=TRUE, says "put what follows in upper case". Uppercasing _ has no effect; if we wanted to capitalize everything after (for example) a lower-case g, we would need to exclude the g from the stored pattern and include it in the replacement pattern: gsub("g(.*)","g\\U\\1",x,perl=TRUE)
For more details, search for "replacement" and "capitalizing" in ?gsub (and see ?regexp for general information about regular expressions).
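To see the two variants side by side, here is a small sketch using the vector from the question; the names after_underscore and after_g are only illustrative:

```r
x <- c("NYC_23df", "BOS_3_rb", "mgh_3_3_f")

# The group (_.*) stores the first underscore and everything after it;
# \\U uppercases the stored text as \\1 inserts it.
after_underscore <- gsub("(_.*)", "\\U\\1", x, perl = TRUE)

# Keeping the g outside the group leaves it in lower case;
# only the text after the g is uppercased.
after_g <- gsub("g(.*)", "g\\U\\1", x, perl = TRUE)

after_underscore
after_g
```

Only the third string contains a g, so the second call should leave the first two strings unchanged.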
Upvotes: 27
Reputation: 13100
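base::strsplit
x <- c("NYC_23df", "BOS_3_rb", "mgh_3_3_f", "a")
myCap <- function(x) {
  # Process each string separately; sapply() keeps the inputs as names.
  sapply(x, function(y) {
    # Split on every underscore; the first piece is left untouched.
    # Note: strsplit() drops a trailing empty piece, so a trailing "_" is lost.
    temp <- unlist(strsplit(y, "_"))
    out <- temp[1]
    if (length(temp[-1])) {
      # Uppercase the remaining pieces and glue them back together with "_".
      out <- paste(temp[1], paste(toupper(temp[-1]), collapse = "_"), sep = "_")
    }
    out
  })
}
> myCap(x)
   NYC_23df    BOS_3_rb   mgh_3_3_f           a
 "NYC_23DF"  "BOS_3_RB" "mgh_3_3_F"         "a"
stringr::str_locate
pkg <- "stringr"
# Install stringr if it is not available yet, then load it.
if (!require(pkg, character.only = TRUE)) {
  install.packages(pkg)
  require(pkg, character.only = TRUE)
}
myCap.2 <- function(x) {
  sapply(x, function(y) {
    # Position of the first underscore (one-row start/end matrix; NA if absent).
    idx <- str_locate(y, "_")
    if (!all(is.na(idx[1, ]))) {
      # Replace everything from the underscore onward with its uppercase form.
      str_sub(y, idx[, 1], nchar(y)) <- toupper(str_sub(y, idx[, 1], nchar(y)))
    }
    y
  })
}
> myCap.2(x)
   NYC_23df    BOS_3_rb   mgh_3_3_f           a
 "NYC_23DF"  "BOS_3_RB" "mgh_3_3_F"         "a"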
Upvotes: 4