Rivka

Reputation: 307

R applying an ifelse statement to every cell of a data.frame

I edited this question (hopefully as requested).

I need to check whether the value in every cell of a data.frame is within a certain range. I am very new to apply and still need to work on understanding it.

I have 2 data.frames: blood_df holds the values, and stat_df holds the limits for each column.

Here is a minimal example. This is what I have so far, but it calculates the same result for every cell.

c0 <- c(0,0,0,0)
c1 <- c(1,2,3,4)
c2 <- c(5,6,7,8)
c3 <- c(9,10,11,12)
c4 <- c(13,14,15,16)

blood_df <- data.frame(c0,c1,c2,c3,c4)
stat_df <- data.frame(matrix(ncol = 5, nrow = 6))
colnames(stat_df) <- colnames(blood_df)
rownames(stat_df) <- c("Mean","3*sd","sum", "Mean2","-3*sd","sum2" )

stat_df[1,2:5] <-apply(blood_df[,2:5], 2,  mean, na.rm = TRUE)
stat_df[2,2:5] <-apply(blood_df[1:4,2:5], 2, function(x)  3*sd(x,na.rm=TRUE))
stat_df[3,] <-colSums(stat_df[1:2,])
stat_df[4,2:5] <-apply(blood_df[,2:5], 2,  mean, na.rm = TRUE)
stat_df[5,2:5] <-apply(blood_df[1:4,2:5], 2, function(x) -3*sd(x,na.rm=TRUE))
stat_df[6,] <-colSums(stat_df[4:5,])

blood_df:
##   c0 c1 c2 c3 c4
## 1  0  1  5  9 13
## 2  0  2  6 10 14
## 3  0  3  7 11 15
## 4  0  4  8 12 16

stat_df:
##       c0        c1        c2        c3        c4
## Mean  NA  2.500000  6.500000 10.500000 14.500000
## 3*sd  NA  3.872983  3.872983  3.872983  3.872983
## sum   NA  6.372983 10.372983 14.372983 18.372983
## Mean2 NA  2.500000  6.500000 10.500000 14.500000
## -3*sd NA -3.872983 -3.872983 -3.872983 -3.872983
## sum2  NA -1.372983  2.627017  6.627017 10.627017 

The part that is not working as I need it:

blood_df[1:4,2:5] <- apply(blood_df[,2:5],2,  function(x) 
                   (ifelse((x > (stat_df[3,2:5]))|| 
                   (x < (stat_df[6,2:5])), NA, x)))

So far it gives me:

blood_df:
##   c0 c1 c2 c3 c4
## 1  0  1  1  1  1
## 2  0  5  5  5  5
## 3  0 NA NA NA NA
## 4  0 NA NA NA NA

What I'd like to get is this (every value checked against the range for its own column):

blood_df:
##   c0 c1 c2 c3 c4
## 1  0  1  5  9 13
## 2  0  2  6 10 14
## 3  0  3  7 11 15
## 4  0  4  8 12 16

If it's not in the range, the value should change to NA.

Thanks!

Upvotes: 0

Views: 699

Answers (1)

Gregory Demin

Reputation: 4836

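Try mapply, which walks over the columns of blood_df and stat_df in parallel so that each column is compared only with its own limits. Note the element-wise | in place of ||, which is meant for a single TRUE/FALSE value: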

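column_range = 2:5
# Pair each blood_df column with the stat_df column of the same name.
blood_df[, column_range] = mapply(function(blood, stat){
        # stat[3] is the upper limit ("sum"), stat[6] the lower limit ("sum2")
        ifelse((blood > stat[3]) | (blood < stat[6]), NA, blood)
    },
    blood_df[, column_range],
    stat_df[, column_range],
    SIMPLIFY = FALSE  # keep a list of columns so it can be assigned back
)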
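
If you would rather avoid the loop over columns, the same check can be done on the whole block at once. What follows is only a sketch and not part of the code above: mask_out_of_range is a hypothetical helper name, and it assumes the limits are passed as plain numeric vectors with one entry per column instead of being read from stat_df. sweep() compares column j of the matrix with element j of the limit vector.

```r
# Hypothetical helper (not from the original post): set every cell of `df`
# that lies outside its own column's [lower, upper] interval to NA.
mask_out_of_range <- function(df, lower, upper) {
  m <- as.matrix(df)
  # sweep() with MARGIN = 2 compares column j of m with upper[j] / lower[j];
  # the element-wise | combines the two logical matrices cell by cell.
  outside <- sweep(m, 2, upper, ">") | sweep(m, 2, lower, "<")
  m[outside] <- NA
  as.data.frame(m)
}

# Same example data as in the question.
blood_df <- data.frame(
  c0 = c(0, 0, 0, 0),
  c1 = c(1, 2, 3, 4),
  c2 = c(5, 6, 7, 8),
  c3 = c(9, 10, 11, 12),
  c4 = c(13, 14, 15, 16)
)

cols <- 2:5
col_mean <- sapply(blood_df[, cols], mean, na.rm = TRUE)
col_3sd  <- 3 * sapply(blood_df[, cols], sd, na.rm = TRUE)

# Limits are mean - 3*sd and mean + 3*sd, i.e. rows "sum2" and "sum" of stat_df.
blood_df[, cols] <- mask_out_of_range(
  blood_df[, cols],
  lower = col_mean - col_3sd,
  upper = col_mean + col_3sd
)
```

With the example data every value lies inside its column's limits, so blood_df comes back unchanged, which matches the desired output in the question. Passing narrower limits, such as lower = 2 and upper = 8 for a single column, is an easy way to see cells turn into NA.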

Upvotes: 1
