petergensler

Reputation: 342

Change select columns from character to integers

I have a dataframe as such:

sample = data.frame(
  beer_brewerId = c("8481", "8481", "8481"),
  rev_app = c("4/5","1/5", "2/5"),
  beer_name = c("John Harvards Simcoe IPA", "John Harvards Simcoe IPA", "John Harvards American Brown Ale"),
  review_taste =c("6/10", "7/10", "6/10"), stringsAsFactors = FALSE
)
str(sample)

I would like to convert only columns 2 and 4 from character to integer for analysis purposes. This would be easy if I wanted to convert every character column to numeric with the following code, but that does not work here because I want to keep column 3 as a chr type:

sample %>%
  select(2,4) %>%
  mutate_if(is.character, as.numeric)

You can easily accomplish this with base R:

#base approach
#use numeric positions; c("2","4") would look for columns named "2" and "4"
cols <- c(2, 4)
sample[cols] <- lapply(sample[cols], as.numeric)

Is there an easy way to do this using dplyr, ideally within a pipe sequence? If you select only certain columns using select(), you cannot save the results back into the original dataframe.

Something like this would work, but as my dataset has 15+ columns, it seems very cumbersome:

cleandf <- sample %>%
#Use transform or mutate to convert each column manually
transform(rev_app = as.integer(rev_app)) %>%
transform(review_taste = as.integer(review_taste))

Are mutate_at or mutate_each meant to perform this task? Any help would be appreciated. Thanks.

#Maybe something like this:
cols <- c("2","4")
data %>%
  mutate_each(is.character[cols], as.numeric)

Upvotes: 1

Views: 11091

Answers (2)

petergensler

Reputation: 342

The easiest way to accomplish this is to use mutate_at with the column indexes you want to convert:

library(dplyr)
library(stringr)

sample <- sample %>%
  #Do normal mutations on the data: strip the "/5" and "/10" suffixes
  mutate(rev_app = str_replace_all(rev_app, "/5", "")) %>%
  mutate(review_taste = str_replace_all(review_taste, "/10", "")) %>%

  #Now add this one-liner onto your chain
  mutate_at(c(2,4), as.numeric) %>%
  #glimpse() prints the structure and passes the data through unchanged
  glimpse()
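
In dplyr 1.0.0 and later, mutate_at is superseded by across(). As a minimal sketch of the same idea, assuming the toy data from the question and that only the number before the slash is wanted, the cleaning and the conversion can be done in one step for columns 2 and 4:

```r
library(dplyr)

# Same toy data as in the question
sample <- data.frame(
  beer_brewerId = c("8481", "8481", "8481"),
  rev_app = c("4/5", "1/5", "2/5"),
  beer_name = c("John Harvards Simcoe IPA", "John Harvards Simcoe IPA",
                "John Harvards American Brown Ale"),
  review_taste = c("6/10", "7/10", "6/10"),
  stringsAsFactors = FALSE
)

converted <- sample %>%
  # For columns 2 and 4 only: drop everything from the slash onward,
  # then convert the remaining digits to integer
  mutate(across(c(2, 4), function(x) as.integer(sub("/.*$", "", x))))

str(converted)
```

Columns 1 and 3 stay as chr because across() only touches the selected positions.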

Upvotes: 4

jess

Reputation: 534

You can achieve this with the mutate_at function.

library(dplyr)

sample = data.frame(beer_brewerId = c("8481", "8481", "8481"),
     rev_app = c("4/5","1/5", "2/5"),
     beer_name = c("John Harvards Simcoe IPA", "John Harvards Simcoe IPA", "John Harvards American Brown Ale"),
     review_taste =c("6/10", "7/10", "6/10"), stringsAsFactors = FALSE)

# evaluate each string as an R expression, so "4/5" becomes 0.8
# (only safe when the strings come from a trusted source)
clean <- function(foo) {
  sapply(foo, function(x) eval(parse(text = x)), USE.NAMES = FALSE)
}
# you can replace c(2,4) by whatever columns you need
clean_sample <- sample %>%
  mutate_at(c(2,4), clean)

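Columns 2 and 4 are now numeric. Note that they hold the evaluated fractions (for example, "4/5" becomes 0.8), not the number before the slash.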
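
If you would rather not run eval(parse()) on data, a possible alternative is to split each string on the slash and divide. This is only a sketch: parse_fraction is a hypothetical helper name, and it assumes every value has the form "numerator/denominator".

```r
# Hypothetical helper: turn strings such as "4/5" into 0.8 without eval()
parse_fraction <- function(x) {
  # Split each string on the literal "/" into numerator and denominator
  parts <- strsplit(x, "/", fixed = TRUE)
  # Divide numerator by denominator, returning one number per input string
  vapply(parts, function(p) as.numeric(p[1]) / as.numeric(p[2]), numeric(1))
}

fractions <- parse_fraction(c("4/5", "1/5", "6/10"))
print(fractions)
```

It could be passed to mutate_at(c(2,4), parse_fraction) in the same way as clean.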

Upvotes: 2
