Reputation: 5300
I have a data frame called 'foo':
foo <- data.frame("row1" = c(1,2,3,4,5), "row2" = c(1,2.01,3,"-","-"))
'foo' was imported from a different program as a CSV file and has two columns. One is a numeric data type and the other is a factor data type.
str(foo)
'data.frame': 5 obs. of 2 variables:
$ row1: num 1 2 3 4 5
$ row2: Factor w/ 4 levels "-","1","2.01",..: 2 3 4 1 1
Notice there are dashes, i.e. "-", in foo$row2, which cause this column to be a factor. I want to replace the dashes with zeros, such that data.class(foo$row2) will return 'numeric'. The idea is to replace all dashes in each column so I can run numerical analyses on it with R.
What is the simplest way to do this in R?
Thanks,
Upvotes: 0
Views: 5856
Reputation: 15458
Q: The idea is to replace all dashes in each column so I can run numerical analyses on it with R.
Use apply or sapply with sub:
kk<-data.frame(apply(foo,2,function(x) as.numeric(sub("-",0,x))))
> kk
row1 row2
1 1 1.00
2 2 2.01
3 3 3.00
4 4 0.00
5 5 0.00
> str(kk$row2)
num [1:5] 1 2.01 3 0 0
Or, you can use sapply
kk<-data.frame(sapply(names(foo),function(x)as.numeric(sub("-",0,foo[,x]))))
Update:
If you want just the second col, you don't need to use apply:
foo$row2 <- as.numeric(sub("-", 0, foo[,2]))
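One caveat with the sub("-", 0, x) calls above: the pattern is unanchored, so it also matches the minus sign of a negative number. The question's data has no negatives, so this does not bite there, but a minimal sketch with made-up values (the vector x below is hypothetical, not from the question) shows the difference an anchored pattern makes:

```r
# Hypothetical data: a genuine negative value alongside the "-" placeholder.
x <- c("1", "-2.5", "-", "3")

# Unanchored pattern: the leading minus of "-2.5" is replaced too,
# so the value silently loses its sign.
unanchored <- as.numeric(sub("-", "0", x))

# Anchored pattern: only cells consisting of a single dash are replaced.
anchored <- as.numeric(sub("^-$", "0", x))

unanchored
anchored
```

If the placeholder can carry surrounding whitespace, it would presumably need trimming first (for example with trimws()) before the anchored pattern matches.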
Upvotes: 2
Reputation: 60000
How about gsub...
as.numeric( gsub("-" , 0 , foo[,2] ) )
#[1] 1.00 2.01 3.00 0.00 0.00
Upvotes: 1
Reputation: 15163
Here is one simple way to do it. There might be a more elegant way, but this will work:
> foo <- data.frame("row1" = c(1,2,3,4,5), "row2" = c(1,2.01,3,"-","-"))
> levels(foo$row2)[levels(foo$row2)=="-"]<-0
> foo$row2<-as.numeric(as.character(foo$row2))
> class(foo$row2)
[1] "numeric"
> foo
row1 row2
1 1 1.00
2 2 2.01
3 3 3.00
4 4 0.00
5 5 0.00
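Since the question asks about every column, the same as.character() then as.numeric() idea can be wrapped in a small helper and applied with lapply. This is only a sketch: dash_to_zero is a name made up here, and stringsAsFactors = TRUE is set explicitly as an assumption, because from R 4.0.0 onward data.frame() no longer creates factors by default.

```r
# Hypothetical helper: convert one column, treating a lone "-" as zero.
dash_to_zero <- function(col) {
  if (is.numeric(col)) return(col)      # already numeric, leave untouched
  values <- as.character(col)           # factor labels, not the level codes
  values[values %in% "-"] <- "0"        # %in% never yields NA in the index
  as.numeric(values)
}

# Rebuild the question's data with a factor column (assumption: mimics
# the pre-4.0 default shown in the str() output of the question).
foo <- data.frame(row1 = c(1, 2, 3, 4, 5),
                  row2 = c(1, 2.01, 3, "-", "-"),
                  stringsAsFactors = TRUE)

# foo[] keeps the data.frame structure while replacing every column.
foo[] <- lapply(foo, dash_to_zero)
str(foo)
```

Any other non-numeric text left in a column would become NA with a coercion warning, which may be useful as a signal that the column holds more than dashes.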
Upvotes: 2
Reputation: 7469
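I would use ifelse() for this:
foo$row2 <- ifelse(foo$row2 == "-", 0, as.numeric(as.character(foo$row2)))
The as.character() call is needed when row2 is a factor: as.numeric() applied directly to a factor returns the internal integer level codes, not the displayed values. Expect an "NAs introduced by coercion" warning, because the dashes are still passed through as.numeric() before ifelse() replaces them with 0.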
Upvotes: 1