Tyler Rinker

Reputation: 110062

Fastest way to capitalize the first word in a string (base)

Using only functions from the base install, what is the fastest way to capitalize the first letter of each string in a vector of text strings?

I have provided a solution below, but it seems very inefficient (it uses substring and pastes the pieces back together). I'm guessing there's a regex solution I'm not thinking of.

Once I have a few responses, I'll microbenchmark them and report back the fastest solution.

Thank you in advance for your help.

x <- c("i like chicken.", "mmh so good", NA)
# desired output
# [1] "I like chicken." "Mmh so good"     NA

Upvotes: 4

Views: 253

Answers (4)

GSee

Reputation: 49830

I didn't time it, but I bet this is pretty fast:

capitalize <- function(string) {
    #substring(string, 1, 1) <- toupper(substring(string, 1, 1))
    substr(string, 1, 1) <- toupper(substr(string, 1, 1))
    string
}
capitalize(x)
#[1] "I like chicken." "Mmh so good"     NA 
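For what it's worth, the replacement form `substr<-` is vectorised and passes `NA` through, which is why no explicit `NA` check is needed here. A quick sketch of the edge cases (the name `cap_first` and the test vector `y` are made up for illustration, just to avoid clashing with other `capitalize` definitions):

```r
# Same technique as above, under a hypothetical name.
cap_first <- function(string) {
    # Overwrite the first character of each element with its upper-case form.
    substr(string, 1, 1) <- toupper(substr(string, 1, 1))
    string
}

# Edge cases: empty string, already capitalized, and NA.
y <- c("i like chicken.", "", "Already", NA)
cap_first(y)
# [1] "I like chicken." ""                "Already"         NA
```

Empty strings and already-capitalized strings come back unchanged, and `NA` stays `NA`.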

Upvotes: 5

huon

Reputation: 102306

The Hmisc package contains a capitalize function:

> require(Hmisc)
> capitalize(c("i like chicken.", "mmh so good", NA))
[1] "I like chicken." "Mmh so good"     NA

(Although this appears to be slower than both the substring and regular expression versions.)

Upvotes: 3

guido

Reputation: 19224

I think this will be the slowest, but let it race against the other solutions:

capitalize <- function(string) {
    sub("^(.)", "\\U\\1", string, perl = TRUE)
}

x <- c("i like chicken.", "mmh so good", NA)
capitalize(x)

EDIT: Actually, on ideone it is faster than the substring version.

EDIT 2: matching any lowercase letter turns out to be slightly slower:

sub("^(\\p{Ll})","\\U\\1", string, perl=TRUE)
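A short sketch of what the two patterns do, in case it helps. The `\\U` in the replacement is a Perl-style escape that upper-cases the rest of the replacement (`\\E` would end the conversion), and R only honours it with `perl = TRUE`. The names `cap_any` and `cap_lower` are made up for this illustration:

```r
x <- c("i like chicken.", "mmh so good", NA)

# "^(.)" captures whatever the first character is and upper-cases it.
cap_any <- function(string) sub("^(.)", "\\U\\1", string, perl = TRUE)

# "^(\\p{Ll})" only matches when the first character is a lowercase letter,
# so strings starting with anything else are left untouched.
cap_lower <- function(string) sub("^(\\p{Ll})", "\\U\\1", string, perl = TRUE)

cap_any(x)
# [1] "I like chicken." "Mmh so good"     NA
cap_lower(x)
# [1] "I like chicken." "Mmh so good"     NA

# A string that does not start with a lowercase letter is unchanged either way.
cap_any("1st place")
# [1] "1st place"
```

On this input both give the same result; the difference is only in how much work the regex engine does per string.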

Upvotes: 4

Tyler Rinker

Reputation: 110062

My solution using substring:

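capitalize <- function(string) {
    # Capitalize a single string; NA is passed through unchanged.
    cap <- function(x) {
        if (is.na(x)) {
            NA_character_
        } else {
            paste0(toupper(substr(x, 1, 1)), substr(x, 2, nchar(x)))
        }
    }
    # vapply always returns a character vector, even for zero-length input.
    vapply(string, cap, character(1), USE.NAMES = FALSE)
}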

x <- c("i like chicken.", "mmh so good", NA)
capitalize(x)
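As a rough sketch of how the benchmark mentioned in the question could be set up using base R only: `system.time` stands in here for a proper microbenchmarking tool such as `microbenchmark::microbenchmark`, which gives finer-grained results if the package is installed. The labels `f_substr`, `f_regex` and `f_sapply`, the input size and the repetition count are all arbitrary choices made for this illustration, and no timings are claimed.

```r
# Candidate implementations, renamed (hypothetical labels) so they can coexist.
f_substr <- function(string) {
    substr(string, 1, 1) <- toupper(substr(string, 1, 1))
    string
}
f_regex <- function(string) sub("^(.)", "\\U\\1", string, perl = TRUE)
f_sapply <- function(string) {
    cap <- function(s) {
        if (is.na(s)) {
            NA_character_
        } else {
            paste0(toupper(substr(s, 1, 1)), substr(s, 2, nchar(s)))
        }
    }
    vapply(string, cap, character(1), USE.NAMES = FALSE)
}

# A larger input so that differences are measurable.
big <- rep(c("i like chicken.", "mmh so good", NA), 1e4)

# Make sure the candidates agree before timing them.
stopifnot(identical(f_substr(big), f_regex(big)),
          identical(f_substr(big), f_sapply(big)))

# Elapsed seconds for 10 repetitions of each candidate.
timings <- sapply(
    list(substr = f_substr, regex = f_regex, sapply = f_sapply),
    function(f) system.time(for (i in 1:10) f(big))[["elapsed"]]
)
timings
```

Results will vary by machine and R version, so it is worth running this on the actual data before picking a winner.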

Upvotes: 1
