user3489562

Reputation: 249

Formatting species names in a column in R

I am working with quite a large database containing a column called 'Species_name'. It is a factor column and holds the names of around 40 different species. As R is case sensitive (which matters when plotting graphs, for example), I was wondering whether it is possible to write a line of code that formats all the species names in this column with a capital first letter and the rest in lower case, e.g. Brown crab, Blonde ray, etc.

Apologies for my ignorance - I am new to R!

Many thanks!

Upvotes: 2

Views: 1005

Answers (3)

David Arenburg

Reputation: 92292

levels(df$Species_name) <- gsub("^([a-z])", "\\U\\1", tolower(levels(df$Species_name)), perl = TRUE)

Explanation:

First, make all names lower case using tolower, then capitalize the first letter using gsub.

^([a-z]) matches a lower-case letter at the very start of the string and captures it, while \\U\\1 converts that captured group to upper case. \\U is a Perl-style replacement escape, hence perl = TRUE.
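As a rough illustration of the line above, here is a minimal sketch on a toy data frame. The values are made up; only the column name Species_name comes from the question. It also shows a side effect that is useful here: assigning duplicate names to levels() merges those levels.

```r
# Toy data for illustration only; the values are made up.
df <- data.frame(
  Species_name = factor(c("BROWN CRAB", "blonde ray", "Brown Crab", "brown crab"))
)

nlevels(df$Species_name)  # 4 distinct spellings before cleaning

# Lower-case every level, then upper-case the first letter of each one.
levels(df$Species_name) <- gsub("^([a-z])", "\\U\\1",
                                tolower(levels(df$Species_name)),
                                perl = TRUE)

nlevels(df$Species_name)       # 2: spellings that now coincide are merged
as.character(df$Species_name)  # one consistent spelling per species
```

Because only the levels are rewritten, the cost depends on the number of distinct names, not on the number of rows.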

Upvotes: 0

bartektartanus

Reputation: 16080

Use functions from the stringi package:

require(stringi)
x <- "alA Ma KOTA 123"
stri_join(stri_trans_toupper(stri_sub(x,1,1)),stri_trans_tolower(stri_sub(x,2)))
## [1] "Ala ma kota 123"

I think it is worth mentioning that there is a function that transforms a string to Title Case, though not in the way you are looking for, since it capitalizes every word:

stri_trans_totitle(x)
## [1] "Ala Ma Kota 123"

Upvotes: 0

ilir

Reputation: 3224

You first need to define a function that transforms character values to the case you want. R has the built-in tolower and toupper, but nothing that capitalizes values the way you want.

capitalize <- function(x){
  first <- toupper(substr(x, start=1, stop=1)) ## capitalize first letter
  rest <- tolower(substr(x, start=2, stop=nchar(x)))   ## everything else lowercase
  paste0(first, rest)
}

Then you only need to apply the function to the levels of your factor variable. That's one advantage of factors: you change the handful of levels rather than every row.

levels(data$Species_name) <- capitalize(levels(data$Species_name))
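To check the result, something like the following sketch could be used. The example values are invented and `data` simply stands in for the real data frame. The function is repeated so the snippet runs on its own.

```r
# capitalize() as defined in this answer, repeated so the snippet is self-contained.
capitalize <- function(x) {
  first <- toupper(substr(x, start = 1, stop = 1))         # first letter upper case
  rest  <- tolower(substr(x, start = 2, stop = nchar(x)))  # remainder lower case
  paste0(first, rest)
}

# Made-up example values; `data` stands in for the real data frame.
data <- data.frame(
  Species_name = factor(c("brown CRAB", "Brown crab", "BLONDE RAY", "brown crab"))
)

levels(data$Species_name) <- capitalize(levels(data$Species_name))

# Counts per species once the spellings have been unified.
counts <- table(data$Species_name)
counts[["Brown crab"]]
counts[["Blonde ray"]]
```

Tabulating the column afterwards is a quick way to confirm that differently cased spellings of the same species now fall under a single level.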

Upvotes: 1
