Bob

Reputation: 516

Issue with number values importing csv files in R

As usual, I am importing a .csv file exported from Excel. Since I will be running some econometric regressions, I am importing not just the values but also some columns with labels.

df <- read.csv("peasantsworkalot.csv", header=TRUE)

where the df looks like the following

country <- c("AT", "AT", "AT", "AT")
code <- c("AT1", "AT1", "AT2", "AT2")
c <- c("Village1", "Village1", "Village2", "Village2")
d <- c("Year1", "Year1", "Year2", "Year2")
e <- c(65322.09, 62322.01, 84561.06, 86000.02)
df <- cbind(country,code,c,d,e)
df

[1,] "AT" "AT1" "Village1" "Year1" "65322.09"
[2,] "AT" "AT1" "Village1" "Year1" "62322.01"
[3,] "AT" "AT2" "Village2" "Year2" "84561.06"
[4,] "AT" "AT2" "Village2" "Year2" "86000.02"

Whenever I try to perform any kind of operation on the values in the e column, I get the following message:

[1] NA
Warning message:
In Ops.factor( ):
  + not meaningful for factors

I suppose that, for some reason, R reads the values as non-numeric. Therefore I tried

as.numeric(df) 

or

as.numeric(df[,5])

The first does not work and gives

Error: (list) object cannot be coerced to type 'double'

The second works, but it changes the values. For instance, 65322.09 becomes 259, and I don't know why. This is the first time it has happened, and it does not happen with every .csv file; some work fine.

Upvotes: 0

Views: 407

Answers (3)

Bob

Reputation: 516

If the .csv file marks missing values with a string other than NA, for instance ..., that string must be passed to read.csv through the na.strings argument: read.csv("readThis.csv", na.strings="..."). This keeps the column numeric. Otherwise, the non-numeric marker causes the whole column to be read as non-numeric.
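As a rough illustration of the point above, here is a minimal sketch that reads a small inline CSV instead of a file. The csv_text content and the column names are made up for the example, and "..." stands for whatever missing-value marker the real file uses.

```r
# Hypothetical CSV content in which a missing value is written as "..."
csv_text <- "code,e\nAT1,65322.09\nAT1,...\nAT2,84561.06"

# Without na.strings, the "..." entry prevents the column from being parsed as numeric
raw <- read.csv(text = csv_text)

# With na.strings, the marker is read as NA and the column stays numeric
clean <- read.csv(text = csv_text, na.strings = "...")

is.numeric(raw$e)    # FALSE
is.numeric(clean$e)  # TRUE
```

After that, functions such as mean(clean$e, na.rm = TRUE) should work on the column directly.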

Upvotes: 0

Señor O

Reputation: 17412

To convert a column to numeric you can run:

df[,5] <- as.numeric(df[,5])

However, if that column is a factor, it will lead to undesired results (see help("factor")). So if it's a factor column, the most straightforward approach is to convert it to character first, then to numeric:

df[,5] <- as.numeric(as.character(df[,5]))
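To see why the direct conversion changes the values, here is a small sketch, assuming the column was read as a factor. The vector e below is a stand-in built from the numbers in the question, not the actual imported data.

```r
# Hypothetical factor column holding the values from the question
e <- factor(c("65322.09", "62322.01", "84561.06", "86000.02"))

# as.numeric() on a factor returns the internal integer level codes,
# not the numbers shown in the labels
codes <- as.numeric(e)

# Converting to character first recovers the real numbers
values <- as.numeric(as.character(e))

# Equivalent alternative: convert the levels once, then index by the factor
values_alt <- as.numeric(levels(e))[e]
```

Here codes contains only small integers between 1 and 4, which is the same effect as 65322.09 turning into 259 in a column with many distinct values.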

Upvotes: 1

stanekam

Reputation: 4030

In your read.csv call, include stringsAsFactors=FALSE: read.csv("readThis.csv", stringsAsFactors=FALSE). Also read the information in the comments. You should definitely brush up on your R knowledge.
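A minimal sketch of what this option changes, using a made-up inline CSV (csv_text and the "n/a" entry are assumptions for the example). Note that since R 4.0.0, stringsAsFactors already defaults to FALSE.

```r
# Hypothetical CSV content with one non-numeric entry in the e column
csv_text <- "code,e\nAT1,65322.09\nAT2,n/a"

# Text columns are kept as character vectors instead of factors
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(df$e)  # "character"

# as.numeric() now parses the text itself; unparseable entries become NA
e_num <- suppressWarnings(as.numeric(df$e))
```

The column is still not numeric when it contains stray text, but converting it no longer silently returns factor level codes.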

Upvotes: 2
