Reputation: 342
I have a dataframe as such:
sample = data.frame(
beer_brewerId = c("8481", "8481", "8481"),
rev_app = c("4/5","1/5", "2/5"),
beer_name = c("John Harvards Simcoe IPA", "John Harvards Simcoe IPA", "John Harvards American Brown Ale"),
review_taste =c("6/10", "7/10", "6/10"), stringsAsFactors = FALSE
)
str(sample)
I would like to convert only columns 2 and 4 from character to integer for analysis. This would be easy if I wanted to convert every character column to numeric, using the code below, but that does not work here because I want to keep column 3 as a chr type:
sample %>%
select(2,4) %>%
mutate_if(is.character, as.numeric)
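You can easily accomplish this with base R:
#base approach
#index by position with numbers; c("2","4") would look for columns named "2" and "4"
cols <- c(2, 4)
#as.numeric() returns NA for strings like "4/5", so strip the "/5" and "/10" first
sample[cols] <- lapply(sample[cols], as.numeric)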
Is there an easy way to do this using dplyr, ideally within a pipe sequence? If you select only certain columns using select(), you cannot save the results back into the original dataframe.
Something like this would work, but as my dataset has 15+ columns, it seems very cumbersome:
cleandf <- sample %>%
#Use transform or mutate to convert each column manually
transform(rev_app = as.integer(rev_app)) %>%
transform(review_taste = as.integer(review_taste))
Is mutate_at or mutate_each meant to perform this task? Any help would be appreciated. Thanks.
#Maybe something like this:
cols <- c("2","4")
data %>%
mutate_each(is.character[cols], as.numeric)
Upvotes: 1
Views: 11091
Reputation: 342
The easiest way to accomplish this is to use mutate_at with the column indices:
library(dplyr)
library(stringr)
sample <- sample %>%
#Do normal mutations on the data: strip the "/5" and "/10" suffixes
mutate(rev_app = str_replace_all(rev_app, "/5", "")) %>%
mutate(review_taste = str_replace_all(review_taste, "/10", "")) %>%
#Now add this one-liner onto your chain
mutate_at(c(2,4), as.numeric) %>%
#glimpse() prints the structure and returns the data, so the assignment still works
glimpse()
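In dplyr 1.0.0 and later, mutate_at is superseded by across(), so the same idea can be written as a single mutate() call. The following is only a sketch under a few assumptions: it strips everything from the "/" onward with base R's sub() instead of stringr, and the name clean_sample is made up for illustration.

```r
library(dplyr)

# Same example data as in the question
sample <- data.frame(
  beer_brewerId = c("8481", "8481", "8481"),
  rev_app = c("4/5", "1/5", "2/5"),
  beer_name = c("John Harvards Simcoe IPA", "John Harvards Simcoe IPA",
                "John Harvards American Brown Ale"),
  review_taste = c("6/10", "7/10", "6/10"),
  stringsAsFactors = FALSE
)

# Convert only columns 2 and 4: drop the "/5" or "/10" suffix, then coerce.
# All other columns pass through unchanged.
clean_sample <- sample %>%
  mutate(across(c(2, 4), ~ as.numeric(sub("/.*$", "", .x))))

str(clean_sample)
```

Selecting by name, for example across(c(rev_app, review_taste), ...), is likely more robust than positions if the column order might change.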
Upvotes: 4
Reputation: 534
You can achieve this with the mutate_at function.
sample = data.frame(beer_brewerId = c("8481", "8481", "8481"),
rev_app = c("4/5","1/5", "2/5"),
beer_name = c("John Harvards Simcoe IPA", "John Harvards Simcoe IPA", "John Harvards American Brown Ale"),
review_taste =c("6/10", "7/10", "6/10"), stringsAsFactors = FALSE)
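# evaluate each "a/b" string as an R expression, so "4/5" becomes 0.8
clean <- function(foo) {
# USE.NAMES = FALSE keeps the result from being named after the input strings
sapply(foo, function(x) eval(parse(text = x)), USE.NAMES = FALSE)
}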
# you can replace c(2,4) by whatever columns you need
clean_sample <- sample %>%
mutate_at(c(2,4), clean)
Columns 2 and 4 are now numeric. Note that each fraction is evaluated, so rev_app becomes 0.8, 0.2, 0.4 rather than 4, 1, 2.
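If evaluating arbitrary strings with eval(parse()) feels risky, one possible alternative is to split each value on "/" and divide the two parts. This is a minimal base R sketch, assuming every value has the form "numerator/denominator"; parse_fraction and clean_sample are hypothetical names chosen for the example.

```r
sample <- data.frame(
  beer_brewerId = c("8481", "8481", "8481"),
  rev_app = c("4/5", "1/5", "2/5"),
  beer_name = c("John Harvards Simcoe IPA", "John Harvards Simcoe IPA",
                "John Harvards American Brown Ale"),
  review_taste = c("6/10", "7/10", "6/10"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split "a/b" into its two parts and return a / b
parse_fraction <- function(x) {
  parts <- strsplit(x, "/", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) / as.numeric(p[2]), numeric(1))
}

# Apply the helper to columns 2 and 4 only; the other columns are untouched
cols <- c(2, 4)
clean_sample <- sample
clean_sample[cols] <- lapply(sample[cols], parse_fraction)

str(clean_sample)
```

The same helper can be passed to mutate_at(c(2,4), parse_fraction) in a pipe, since it takes a character vector and returns a numeric vector of the same length.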
Upvotes: 2