AME

Reputation: 5300

Replace strings in data frame columns with integer in R

I have a data frame called 'foo':

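 # stringsAsFactors = TRUE reproduces the factor column on R >= 4.0, where the default is FALSE
 foo <- data.frame("row1" = c(1,2,3,4,5), "row2" = c(1,2.01,3,"-","-"), stringsAsFactors = TRUE)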

'foo' was uploaded from a different program as a CSV file and has two columns: one is numeric and the other is a factor.

str(foo)
'data.frame':   5 obs. of  2 variables:
$ row1: num  1 2 3 4 5
$ row2: Factor w/ 4 levels "-","1","2.01",..: 2 3 4 1 1

Notice there are dashes ("-") in foo$row2, which cause this column to be read as a factor. I want to replace the dashes with zeros so that data.class(foo$row2) returns "numeric". The idea is to replace all dashes in each column so I can run numerical analyses on it with R.

What is the simplest way to do this in R?

Thanks,

Upvotes: 0

Views: 5856

Answers (4)

Metrics

Reputation: 15458

Q: The idea is to replace all dashes in each column so I can run numerical analyses on it with R.

Use apply or sapply with sub

 kk<-data.frame(apply(foo,2,function(x) as.numeric(sub("-",0,x))))
> kk
  row1 row2
1    1 1.00
2    2 2.01
3    3 3.00
4    4 0.00
5    5 0.00

> str(kk$row2)
 num [1:5] 1 2.01 3 0 0

Or, you can use sapply

kk<-data.frame(sapply(names(foo),function(x)as.numeric(sub("-",0,foo[,x]))))

Update: If you want just the second column, you don't need to use apply:

foo$row2 <- as.numeric(sub("-",0,foo[,2]))
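
One caveat with the unanchored pattern "-": it also matches the minus sign of a negative number, so "-5" would become "05" and be read as 5. If the data can contain negative values, anchoring the pattern is probably safer. A minimal sketch, assuming the placeholder is always a lone dash; the vector `x` is a made-up example, not data from the question:

```r
# Hypothetical input: lone dashes as placeholders plus a genuine negative value
x <- c("1", "2.01", "-", "-5")

# Unanchored pattern: the minus sign in "-5" is replaced as well
unanchored <- as.numeric(sub("-", "0", x))

# Anchored pattern: only elements consisting of a single dash are replaced
anchored <- as.numeric(sub("^-$", "0", x))

print(unanchored)  # the negative value has lost its sign
print(anchored)    # the negative value is preserved
```

For the data in the question, which has no negative numbers, both patterns give the same result.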

Upvotes: 2

Simon O'Hanlon

Reputation: 60000

How about gsub...

as.numeric( gsub("-" , 0 , foo[,2] ) )
#[1] 1.00 2.01 3.00 0.00 0.00

Upvotes: 1

mrip

Reputation: 15163

Here is one simple way to do it. There might be a more elegant way, but this will work:

> foo <- data.frame("row1" = c(1,2,3,4,5), "row2" = c(1,2.01,3,"-","-"), stringsAsFactors = TRUE)
> levels(foo$row2)[levels(foo$row2)=="-"]<-0
> foo$row2<-as.numeric(as.character(foo$row2))
> class(foo$row2)
[1] "numeric"
> foo
  row1 row2
1    1 1.00
2    2 2.01
3    3 3.00
4    4 0.00
5    5 0.00
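
Since the question asks about dashes in every column, the same idea could be wrapped in a small helper and applied to all text-like columns at once. This is only a sketch: the name `dash_to_zero` is hypothetical, and it assumes every factor or character column is meant to be numeric (any other text would become NA with a warning).

```r
# Hypothetical helper: convert every factor or character column to numeric,
# treating a lone "-" as zero
dash_to_zero <- function(df) {
  is_text <- vapply(df, function(col) is.factor(col) || is.character(col),
                    logical(1))
  df[is_text] <- lapply(df[is_text], function(col) {
    col <- as.character(col)      # factor -> labels, not integer codes
    col[col %in% "-"] <- "0"      # %in% is safe if the column contains NA
    as.numeric(col)
  })
  df
}

foo <- data.frame(row1 = c(1, 2, 3, 4, 5),
                  row2 = c(1, 2.01, 3, "-", "-"),
                  stringsAsFactors = TRUE)
foo <- dash_to_zero(foo)
str(foo)
```

Columns that are already numeric are left untouched.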

Upvotes: 2

Stedy

Reputation: 7469

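I would use ifelse() for this:

foo$row2 <- ifelse(foo$row2 == "-", 0, as.numeric(as.character(foo$row2)))

Note the as.character(): calling as.numeric() directly on a factor returns the integer level codes rather than the values. R will also warn about NAs introduced by coercion, because ifelse() evaluates the as.numeric() branch for the dashes too; the result is still correct.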
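
The factor-to-character step matters more than it might look. A short sketch of the difference, using a standalone factor `f` built here purely for illustration:

```r
# Illustrative factor with the same values as foo$row2
f <- factor(c("1", "2.01", "3", "-", "-"))

# Direct conversion returns the internal integer codes of the levels
codes <- as.numeric(f)

# Going through as.character() parses the labels; "-" cannot be parsed,
# so it becomes NA (the coercion warning is suppressed here)
values <- suppressWarnings(as.numeric(as.character(f)))

print(codes)
print(values)
```

The codes are just positions in levels(f), so they bear no relation to the numbers the labels represent.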

Upvotes: 1
