neversaint

Reputation: 63984

How to change stringified numbers in data frame into pure numeric values in R

I have the following data.frame:

employee <- c('John Doe','Peter Gynn','Jolie Hope')
# Note that the salary below is in stringified format.
# In reality there are more such stringified numerical columns.
salary <- as.character(c(21000, 23400, 26800))
df <- data.frame(employee,salary)

The output is:

> str(df)
'data.frame':   3 obs. of  2 variables:
 $ employee: Factor w/ 3 levels "John Doe","Jolie Hope",..: 1 3 2
 $ salary  : Factor w/ 3 levels "21000","23400",..: 1 2 3

What I want to do is convert the values from strings into plain numbers directly in df, while preserving the employee names as strings. I tried this, but it doesn't work:

as.numeric(df)

Ultimately, I'd like to perform arithmetic on these numeric values from df, such as df2 <- log2(df).

Upvotes: 1

Views: 684

Answers (1)

Marius

Reputation: 60060

OK, there are a couple of things going on here:

  • R has two different datatypes that look like strings: factor and character
  • You can't modify most R objects in place, you have to change them by assignment

The actual fix for your example is:

df$salary = as.numeric(as.character(df$salary))
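Since the question mentions several such columns, the same conversion can be applied to all of them at once. This is only a sketch under a few assumptions: the `bonus` column and the `num_cols` vector are hypothetical names added for illustration, and `stringsAsFactors = TRUE` is set explicitly so the columns come out as factors like in the question (that was the default before R 4.0.0).

```r
# Rebuild the example with factor columns, as in the question's str() output.
df <- data.frame(
  employee = c("John Doe", "Peter Gynn", "Jolie Hope"),
  salary   = as.character(c(21000, 23400, 26800)),
  bonus    = as.character(c(1500, 2000, 2500)),  # hypothetical extra column
  stringsAsFactors = TRUE
)

# Hypothetical list of the columns that hold stringified numbers.
num_cols <- c("salary", "bonus")

# factor -> character -> numeric, assigning the result back into df.
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(as.character(x)))

# Arithmetic now works on the numeric columns; employee is left untouched.
df2 <- df
df2[num_cols] <- lapply(df[num_cols], log2)
str(df2)
```

Restricting log2 to `num_cols` matters here, because calling log2 on the whole data frame would fail on the non-numeric employee column.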

If you call as.numeric on df$salary without converting it to character first, you get a somewhat strange result:

> as.numeric(df$salary)
[1] 1 2 3

When R creates a factor, it turns the unique elements of the vector into levels and then stores each element as an integer code that indexes into those levels. Those integer codes are what you see when you convert the factor directly to numeric.
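To make the levels and integer codes visible, here is a small sketch using the salary strings from the question. They are deliberately reordered (an assumption for illustration) so that the codes do not happen to match the order of the values.

```r
# Same salary strings as in the question, reordered for illustration.
f <- factor(c("26800", "21000", "23400"))

levels(f)      # the unique values, stored as sorted strings
as.integer(f)  # the integer codes that index into levels(f)

# Two ways to recover the original numbers from the factor:
via_character <- as.numeric(as.character(f))
via_levels    <- as.numeric(levels(f))[f]
```

Both approaches give the same result. The second converts only the levels and then indexes them by the factor's codes, which can be a little faster on long vectors with few distinct values.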

Upvotes: 4
