PJP

Reputation: 642

How to prevent data frame columns being classed as character instead of numeric

Good morning.

I am cycling through some data, building a data frame as I go. Every time I add or replace a row in the data frame, the numeric values get classed as character, and I have to re-class them. I assume I am doing something wrong when adding the data to the data frame?

test.df<-data.frame(SIDE=rep("",5),n=rep(NA, 5),c1=rep(NA,5),stringsAsFactors=FALSE)
test.df[1,]<-cbind("A",1,256)
test.df[2,]<-cbind("A",2,258)
test.df[3,]<-cbind("A",3,350)
test.df[4,]<-cbind("A",4,400)
test.df[5,]<-cbind("A",5,360)
summary(test.df)
 SIDE                n                  c1           
  Length:5           Length:5           Length:5          
  Class :character   Class :character   Class :character  
  Mode  :character   Mode  :character   Mode  :character  

Convert columns 2 and 3 back to numeric:

test.df[, c(2:3)] <- sapply(test.df[, c(2:3)], as.numeric)
summary(test.df)
 SIDE                 n           c1       
 Length:5           Min.   :1   Min.   :256.0  
 Class :character   1st Qu.:2   1st Qu.:258.0  
 Mode  :character   Median :3   Median :350.0  
                    Mean   :3   Mean   :324.8  
                    3rd Qu.:4   3rd Qu.:360.0  
                    Max.   :5   Max.   :400.0  

So the dataframe is now as I expect it - 1 column of character data and 2 of numeric. However if I change one of the rows again:

test.df[5,]<-cbind("A",5,360)
summary(test.df)
 SIDE                n                  c1           
 Length:5           Length:5           Length:5          
 Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character  

It has gone back to all character!

Is there any way to ensure that the data frame keeps the appropriate classes when I append or change data in it?

Thanks, Pete

Upvotes: 3

Views: 5222

Answers (3)

USER_1

Reputation: 2469

Just had a similar problem, and the quickest way (I think) is to set options(stringsAsFactors=FALSE).

Upvotes: 1

IRTFM

Reputation: 263352

When you form a matrix, it is all of the same mode, so cbind("A",1,256) is all character mode. (There is a cbind.data.frame function, but none of the arguments to cbind were data.frames, so it was not called.) You could have done this:

test.df<-data.frame(SIDE="A",n=1,c1=256,stringsAsFactors=FALSE)
test.df<- rbind( test.df,
                  list("A",2,258),
                  list("A",3,350),
                  list("A",4,400),
                  list("A",5,360) )
 test.df
#---------------    
  SIDE n  c1
1    A 1 256
2    A 2 258
3    A 3 350
4    A 4 400
5    A 5 360
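
The coercion can be checked directly. This is a minimal sketch; the names `m`, `row_list` and `types` are illustrative and not from the question:

```r
# cbind() on bare atomic values builds a matrix, and a matrix has one type.
m <- cbind("A", 1, 256)
is.matrix(m)   # TRUE
typeof(m)      # "character": the numbers were coerced to strings

# A list keeps each element's own type, which is why rbind() with lists
# preserves the column classes of the data frame.
row_list <- list("A", 1, 256)
types <- sapply(row_list, class)
types          # "character" "numeric" "numeric"
```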

Upvotes: 4

Roland

Reputation: 132706

cbind("A",5,360) is a matrix, which can hold only one type, i.e. character in your case.

Use the data.frame method:

cbind.data.frame("A",5,360)
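
Applied to the row-by-row assignment from the question, this could look like the sketch below. It assumes the same `test.df` layout; the use of `NA_real_` for the empty numeric columns, the `list()` assignment and the name `col_classes` are illustrative choices, not part of the original answer:

```r
test.df <- data.frame(SIDE = rep("", 5),
                      n    = rep(NA_real_, 5),
                      c1   = rep(NA_real_, 5),
                      stringsAsFactors = FALSE)

# Assign a row as a list so that every element keeps its own type.
test.df[1, ] <- list("A", 1, 256)
test.df[2, ] <- list("A", 2, 258)

# A one-row data frame also works; values are matched by position.
# stringsAsFactors = FALSE only matters on R versions before 4.0.
test.df[3, ] <- data.frame("A", 3, 350, stringsAsFactors = FALSE)

# Check that the numeric columns stayed numeric.
col_classes <- sapply(test.df, class)
col_classes
```

Replacing a row a second time in the same way should leave the classes unchanged, since no character matrix is ever created.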

However, "cycling through some data" is probably the least efficient way to do this in R.
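
One possible alternative to row-by-row filling is sketched below, using the values from the question. The vector names `side`, `n` and `c1`, and the names `rows` and `looped.df`, are assumptions made for illustration; in real code the vectors would come from whatever is being cycled through:

```r
# Hypothetical inputs holding the values from the question.
side <- rep("A", 5)
n    <- 1:5
c1   <- c(256, 258, 350, 400, 360)

# Build the data frame in one step; each column keeps its vector's type.
test.df <- data.frame(SIDE = side, n = n, c1 = c1, stringsAsFactors = FALSE)
str(test.df)

# If a loop is unavoidable, collect one-row data frames in a list
# and bind them once at the end instead of growing the data frame.
rows <- lapply(seq_along(n), function(i) {
  data.frame(SIDE = side[i], n = n[i], c1 = c1[i], stringsAsFactors = FALSE)
})
looped.df <- do.call(rbind, rows)
```

Binding once at the end avoids copying the data frame on every iteration, which is usually the main cost of growing it inside a loop.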

Upvotes: 6
