Reputation: 445
I have many data frames in which every column is character. A variable whose values are all numbers should clearly be a numeric type, but I have hundreds of columns, so I don't want to type out each one in order to change it. Is there a way to automate this: scan each column, check whether its character values are numeric, and if so convert the column from character to numeric?
employee <- c('John Doe','Peter Gynn','Jolie Hope')
salary <- c("21000", "23400", "26800")
gender <- c("M", "M", "F")
rank <- c("5", "109", "2")
df <- data.frame(employee, salary, gender, rank)
I don't want to have to do this for each column/var
df$rank <- as.numeric(df$rank)
I would like to do something like this
i <- sapply(df, is.vector.of.columns.contaning.numeric.values)
df[i] <- lapply(df[i], as.numeric)
Upvotes: 1
Views: 706
Reputation: 28441
We can write a function that encodes the number condition. It works by trying as.numeric and checking whether the result contains NA. If it does, at least one value cannot be coerced to an unambiguous numeric, and the function keeps the column as is.
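smartConvert <- function(x) {
  # Go through character first: as.numeric() on a factor returns its integer level codes
  num <- suppressWarnings(as.numeric(as.character(x)))
  # Any NA means at least one value is not a number, so keep the column unchanged
  if (any(is.na(num))) x else num
}
df[] <- lapply(df, smartConvert)
str(df)
# 'data.frame': 3 obs. of 4 variables:
# $ employee: Factor w/ 3 levels "John Doe","Jolie Hope",..: 1 3 2
# $ salary : num 21000 23400 26800
# $ gender : Factor w/ 2 levels "F","M": 2 2 1
# $ rank : num 5 109 2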
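One caveat with an NA-based check is that a column which already contains missing values is left as it is, even when every other entry is a number. A minimal sketch of a variant that tolerates this, written in the `i <- sapply(...)` form the question asks for, might look like the following. The helper name `is_numeric_like` is hypothetical, and the NA in `salary` is added here purely for illustration.

```r
# Hypothetical helper: TRUE when every non-missing value parses as a number
is_numeric_like <- function(x) {
  x <- as.character(x)                      # also handles factor columns
  num <- suppressWarnings(as.numeric(x))
  # Conversion must not introduce NAs beyond those already present
  !all(is.na(x)) && all(is.na(num) == is.na(x))
}

df <- data.frame(
  employee = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary   = c("21000", NA, "26800"),       # illustrative missing value
  gender   = c("M", "M", "F"),
  rank     = c("5", "109", "2"),
  stringsAsFactors = FALSE
)

# Logical index of columns to convert, then convert only those
i <- sapply(df, is_numeric_like)
df[i] <- lapply(df[i], function(col) as.numeric(as.character(col)))
str(df)
```

Comparing the NA positions before and after conversion, rather than testing for any NA at all, is what lets `salary` convert while `employee` and `gender` stay untouched.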
Upvotes: 3