Reputation: 127
I was doing some calculation in R and was confused by the logic R uses.
For example,
table <- data.frame(a = c(1,NA,2,1), b= c(1,1,3,2))
Here, I want to create a third column, "c".
Column c should be 0 where column a is NA. Otherwise it should be the sum of column a and column b.
So column c should be
c(2,0,5,3)
I wrote:
table$c <- 0
table$c[!is.na(table$a)] <- table$a + table$b
And I have column c as
c(2,0,NA,5)
I see that
table$c[3] = table$a[2]+table$b[2]
when I wanted it to be table$c[3] = table$a[3] + table$b[3].
I thought R would skip index 2 on both the left and right sides and jump to index 3 in the calculation, but in fact R skipped index 2 on the left side only, not on the right side.
Why does this happen? How should I prevent this?
Thank you.
Upvotes: 0
Views: 342
Reputation: 1795
Alternatively, you could use the data.table package:
library(data.table)
table <- data.table(a = c(1, NA, 2, 1), b = c(1, 1, 3, 2))  # creates the data.table
table[, c := ifelse(is.na(a), 0, a + b)]  # creates column c based on the condition
> table
    a b c
1:  1 1 2
2: NA 1 0
3:  2 3 5
4:  1 2 3
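The same ifelse() idea also works in base R without loading data.table. A minimal sketch, assuming the data from the question; the data frame is named df here (a name chosen only for this illustration) so that it does not mask base R's table() function:

```r
# Same data as in the question, under the illustrative name df
df <- data.frame(a = c(1, NA, 2, 1), b = c(1, 1, 3, 2))

# ifelse() is vectorised: every row is tested, and each row gets
# either 0 (a is NA) or a + b (a is not NA)
df$c <- ifelse(is.na(df$a), 0, df$a + df$b)

df$c
# 2 0 5 3
```

Because the test, the "yes" value and the "no" value are all evaluated over the full column, every row keeps its own position and nothing gets shifted.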
Upvotes: 0
Reputation: 541
Use
table$c <- apply(table[, c("a", "b")], 1, sum)  # sum only a and b, even if c already exists
table$c[is.na(table$c)] <- 0
Or, even simpler if you are just starting to learn R:
table$c <- table$a + table$b
table$c[is.na(table$c)] <- 0
To avoid problems like the one in your case, don't ask R to do two things at once, as here:
table$c[!is.na(table$a)] <- table$a + table$b
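You basically asked R to filter out the NA rows 'on the fly', but only on the left-hand side. The right-hand side is still the full 4-element vector, so its first three values (2, NA, 5) are assigned in order to the three selected positions, and R warns that the number of items to replace is not a multiple of the replacement length.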
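If you do want to keep the subset assignment, one way is to apply the same logical mask to both sides so that the lengths match. A minimal sketch, assuming the data from the question; the names df and keep are introduced here only for illustration:

```r
# Same data as in the question, under the illustrative name df
df <- data.frame(a = c(1, NA, 2, 1), b = c(1, 1, 3, 2))

# Logical mask: TRUE where a is not NA
keep <- !is.na(df$a)

df$c <- 0
# Subset both sides with the same mask: 3 values go into 3 positions
df$c[keep] <- df$a[keep] + df$b[keep]

df$c
# 2 0 5 3
```

With the mask on both sides, row 3 on the left is paired with row 3 on the right, and no length-mismatch warning is raised.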
Upvotes: 2