Reputation: 641
I would like to capitalize the first letter of each word in a column, without converting the remaining letters to lowercase. I am trying to use stringr since it's vectorized and plays well with dataframes, but I would also accept another solution. Below is a reprex showing my desired output and various attempts. I am able to select only the first letter, but I am not sure how to capitalize it. Thank you for your help!
I also reviewed related posts, but wasn't sure how to apply those solutions in my case (i.e., within a dataframe):
Capitalize the first letter of both words in a two word string
library(dplyr)
library(stringr)
words <-
tribble(
~word, ~number,
"problems", 99,
"Answer", 42,
"golden ratio", 1.61,
"NOTHING", 0
)
# Desired output
new_words <-
tribble(
~word, ~number,
"Problems", 99,
"Answer", 42,
"Golden Ratio", 1.61,
"NOTHING", 0
)
# Converts first letter of each word to upper and all other to lower
mutate(words, word = str_to_title(word))
#> # A tibble: 4 x 2
#> word number
#> <chr> <dbl>
#> 1 Problems 99
#> 2 Answer 42
#> 3 Golden Ratio 1.61
#> 4 Nothing 0
# Some attempts
mutate(words, word = str_replace_all(word, "(?<=^|\\s)([a-zA-Z])", "X"))
#> # A tibble: 4 x 2
#> word number
#> <chr> <dbl>
#> 1 Xroblems 99
#> 2 Xnswer 42
#> 3 Xolden Xatio 1.61
#> 4 XOTHING 0
mutate(words, word = str_replace_all(word, "(?<=^|\\s)([a-zA-Z])", "\\1"))
#> # A tibble: 4 x 2
#> word number
#> <chr> <dbl>
#> 1 problems 99
#> 2 Answer 42
#> 3 golden ratio 1.61
#> 4 NOTHING 0
Created on 2021-07-26 by the reprex package (v2.0.0)
Upvotes: 7
Views: 5342
Reputation: 78937
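We could use the str_to_title function from the stringr package.
The problem is that NOTHING turns into Nothing.
But we can overcome this with an ifelse: check whether the string consists entirely of uppercase letters; if so, leave it as is, otherwise apply str_to_title. Note that this only protects fully uppercase strings. A mixed-case word such as "iPhone" would still be changed to "Iphone".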
library(dplyr)
library(stringr)
words %>%
  mutate(word = ifelse(str_detect(word, "^[:upper:]+$"), word, str_to_title(word)))
Output:
word number
<chr> <dbl>
1 Problems 99
2 Answer 42
3 Golden Ratio 1.61
4 NOTHING 0
Upvotes: 3
Reputation: 6230
According to https://community.rstudio.com/t/is-there-will-there-be-perl-support-in-stringr/38016/3, stringr is built on stringi and the ICU regex engine, so it does not and will not support Perl-style regex features such as the \U case-conversion escape (which is what makes the \U\1 replacement in the other answers work). So you should use the gsub with perl=TRUE answer by @Tim.
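For readers who would still like to stay within stringr, here is a minimal sketch. It assumes a stringr version in which str_replace_all() accepts a function as its replacement argument, applied to each match. The pattern is the lookbehind from the question's own attempts, narrowed to lowercase letters. This is one possible workaround, not something the linked thread proposes.

```r
library(dplyr)
library(stringr)

# Same data as in the question
words <- tribble(
  ~word,          ~number,
  "problems",     99,
  "Answer",       42,
  "golden ratio", 1.61,
  "NOTHING",      0
)

# Assumption: str_replace_all() accepts a function as `replacement`.
# Each lowercase letter at the start of the string or right after
# whitespace is passed to toupper(); all other characters are untouched.
result <- mutate(
  words,
  word = str_replace_all(word, "(?<=^|\\s)[a-z]", toupper)
)

result$word
```

If the function-replacement form is not available in your stringr version, the gsub approach with perl=TRUE remains the fallback.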
Upvotes: 1
Reputation: 521467
Here is a base R solution using gsub:
words$word <- gsub("\\b([a-z])", "\\U\\1", words$word, perl=TRUE)
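This will replace the first lowercase letter of every word with its uppercase version. Note that the \b word boundary will match a lowercase letter preceded by either whitespace or the start of the column's value. It also matches after punctuation such as an apostrophe or hyphen, so "don't" would become "Don'T".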
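Since the question asks for a solution that works inside a dataframe, here is a hedged sketch of the same gsub idea used within mutate(). The helper name cap_words and the "don't stop" string are hypothetical, introduced only for illustration. The helper swaps \b for an explicit "start of string or whitespace" group, which is an assumption about what "word" should mean here rather than part of the original answer.

```r
library(dplyr)

# Hypothetical helper (not from the answer): capitalise only letters that
# start the string or follow whitespace, leaving everything else as is.
cap_words <- function(x) {
  gsub("(^|\\s)([a-z])", "\\1\\U\\2", x, perl = TRUE)
}

# Same data as in the question
words <- tribble(
  ~word,          ~number,
  "problems",     99,
  "Answer",       42,
  "golden ratio", 1.61,
  "NOTHING",      0
)

new_words <- mutate(words, word = cap_words(word))
new_words$word
#> [1] "Problems"     "Answer"       "Golden Ratio" "NOTHING"

# The two patterns differ on an illustrative string with an apostrophe
gsub("\\b([a-z])", "\\U\\1", "don't stop", perl = TRUE)
#> [1] "Don'T Stop"
cap_words("don't stop")
#> [1] "Don't Stop"
```

For the data in the question both patterns give the same result, so the choice only matters if the column can contain punctuation inside words.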
Upvotes: 6