Spruce Island

Reputation: 445

R check character values for numeric and change var datatype automatically

I have many data frames where all the data is character. I can guess that a variable containing only numbers should be changed to a numeric data type, but I have hundreds of columns, so I don't want to type out each one in order to change it. Is there a way to automate this: scan each column, check whether its character values are numeric, and if so change the column from character to numeric?

employee <- c('John Doe','Peter Gynn','Jolie Hope')
salary <- c("21000", "23400", "26800")
gender <- c("M", "M", "F")
rank <- c("5", "109", "2")

df <- data.frame(employee, salary, gender, rank)

I don't want to have to do this for each column/variable:

df$rank <- as.numeric(df$rank)

I would like to do something like this:

i <- sapply(df, is.vector.of.columns.contaning.numeric.values)
df[i] <- lapply(df[i], as.numeric)

Upvotes: 1

Views: 706

Answers (1)

Pierre L

Reputation: 28441

We can write a function that encodes the numeric condition. It tries as.numeric on the column and checks whether any result is NA. If so, at least one value cannot be coerced to an unambiguous number, and the function keeps the column as is. Note that a column that already contains NA is therefore also left unchanged.

smartConvert <- function(x) {
  # Go through as.character so a factor yields its labels, not its integer codes
  num <- suppressWarnings(as.numeric(as.character(x)))
  if (any(is.na(num))) x else num
}

df[] <- lapply(df, smartConvert)
str(df)
# 'data.frame': 3 obs. of  4 variables:
#  $ employee: Factor w/ 3 levels "John Doe","Jolie Hope",..: 1 3 2
#  $ salary  : num  21000 23400 26800
#  $ gender  : Factor w/ 2 levels "F","M": 2 2 1
#  $ rank    : num  5 109 2
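If you prefer the two-step pattern from the question (build a logical index, then convert only those columns), here is a rough sketch. The helper name isNumericLike is made up for illustration, and I'm assuming the data frame is built with stringsAsFactors = FALSE and that existing NA values should not block the conversion.

```r
# Hypothetical helper: TRUE when every non-missing value parses as a number
isNumericLike <- function(x) {
  x <- as.character(x)  # factor labels, not integer codes
  parsed <- suppressWarnings(as.numeric(x))
  # Need at least one real value, and no value that turned into NA only on parsing
  any(!is.na(x)) && !any(is.na(parsed) & !is.na(x))
}

# Sample data modeled on the question, with one missing salary (an assumption)
df <- data.frame(
  employee = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary   = c("21000", "23400", NA),
  gender   = c("M", "M", "F"),
  rank     = c("5", "109", "2"),
  stringsAsFactors = FALSE
)

i <- sapply(df, isNumericLike)  # logical index of numeric-looking columns
df[i] <- lapply(df[i], function(x) as.numeric(as.character(x)))
sapply(df, class)
```

Here salary and rank should come out numeric while employee and gender stay character. Whether NA should count as "still numeric" depends on your data, so adjust the predicate if missing values mean something else for you.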

Upvotes: 3
