Thigers

Reputation: 33

How to capitalize all but some letters in R

I have a dataframe in R with a column of strings, e.g. v1 <- c('JaStADmmnIsynDK', 'laUksnDTusainS')

My goal is to capitalize all letters in each string except 's', 't' and 'y'.

So the result should end up being: 'JAStADMMNIsyNDK' and 'LAUKsNDTUsAINS'.

That is, none of the letters 's', 't' and 'y' should be changed.

At the moment I do this by simply repeating the following line 25 times, with a different letter each time:

levels(df$strings) <- sub('n', 'N', levels(df$strings))

But that seems to be overkill! How can I do this easily in R?

Upvotes: 3

Views: 1221

Answers (3)

smci

Reputation: 33940

We can directly gsub() an uppercase replacement onto each applicable lowercase letter, using the Perl '\U' case-conversion escape on the '\1' capture group (which @akrun reminded me of):

v1 <- c("JaStADmmnIsynDK", "laUksnDTusainS")
# [a-ru-xz] matches every lowercase letter except 's', 't' and 'y'
gsub('([a-ru-xz])', '\\U\\1', v1, perl = TRUE)
#[1] "JAStADMMNIsyNDK" "LAUKsNDTUsAINS"
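
Since the question works on the levels of a factor column, the same gsub() call can probably be applied to levels(df$strings) directly. A minimal sketch, assuming a data frame shaped like the one in the question (the `df` built here is made up for illustration):

```r
# Hypothetical data frame; `df` and `strings` follow the naming used in the question.
df <- data.frame(strings = factor(c("JaStADmmnIsynDK", "laUksnDTusainS")))

# Rewrite the factor levels in one pass instead of one sub() call per letter.
levels(df$strings) <- gsub("([a-ru-xz])", "\\U\\1", levels(df$strings), perl = TRUE)

as.character(df$strings)
#[1] "JAStADMMNIsyNDK" "LAUKsNDTUsAINS"
```

If the column is a plain character vector rather than a factor, the same call should work on df$strings itself.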

Upvotes: 1

Rohit Das

Reputation: 2032

The answer posted by @akrun is indeed brilliant, but here is my more direct brute-force approach, which I finished too late.

s <- "JaStADmmnIsynDK"

customUpperCase <- function(s,ignore = c("s","t","y")) {
  u <- sapply(unlist(strsplit(s,split = "")),
              function(x) if(!(x %in% ignore)) toupper(x) else x )
  paste(u,collapse = "")
}

customUpperCase(s)
#[1] "JAStADMMNIsyNDK"
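
Note that customUpperCase() as written handles one string at a time, because unlist() merges the characters of all input strings into a single vector. A possible vectorized variant, as a rough sketch (the name customUpperCaseVec is made up here):

```r
# Hypothetical vectorized variant: processes each input string separately.
customUpperCaseVec <- function(x, ignore = c("s", "t", "y")) {
  vapply(strsplit(x, split = ""), function(chars) {
    skip <- chars %in% ignore                 # characters to leave untouched
    chars[!skip] <- toupper(chars[!skip])     # uppercase everything else
    paste(chars, collapse = "")
  }, character(1))
}

customUpperCaseVec(c("JaStADmmnIsynDK", "laUksnDTusainS"))
#[1] "JAStADMMNIsyNDK" "LAUKsNDTUsAINS"
```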

Upvotes: 1

akrun

Reputation: 887028

Try

v2 <- gsub("[sty]", "", paste(letters, collapse="")) 
chartr(v2, toupper(v2), v1)
#[1] "JAStADMMNIsyNDK" "LAUKsNDTUsAINS" 

data

v1 <- c("JaStADmmnIsynDK", "laUksnDTusainS")
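
For what it's worth, the chartr() call above works because chartr(old, new, x) translates each character of old into the character at the same position in new, so passing the 23 remaining lowercase letters and their uppercase forms leaves 's', 't' and 'y' untouched. A small sketch wrapping that in a reusable helper (upper_except and its keep argument are hypothetical names, not part of base R):

```r
# Hypothetical helper: uppercase every lowercase letter except those in `keep`.
upper_except <- function(x, keep = c("s", "t", "y")) {
  from <- paste(setdiff(letters, keep), collapse = "")  # letters to translate
  chartr(from, toupper(from), x)                         # one-to-one mapping
}

upper_except(c("JaStADmmnIsynDK", "laUksnDTusainS"))
#[1] "JAStADMMNIsyNDK" "LAUKsNDTUsAINS"
```

Changing keep, e.g. upper_except(v1, keep = c("s", "n")), would protect a different set of letters.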

Upvotes: 6
