user1830307

Modifying for loop with if conditions to apply format in R

I am creating a variable called indexPoints that contains the subset of index values that pass certain conditions:

set.seed(1)
x = abs(rnorm(100,1))
y = abs(rnorm(100,1))
threshFC = 0.5

indexPoints=c()
seqVec = seq(1, length(x))
for (i in seq_along(seqVec)){
    fract = x[i]/y[I]
    fract[1] = NaN
    if (!is.nan(fract)){
        if(fract > (threshFC + 1) || fract < (1/(threshFC+1))){
            indexPoints = c(indexPoints, i)
        }
    }
}

I am trying to recreate indexPoints using a more efficient method, such as one of the apply functions (any except sapply). My attempt so far is shown below:

set.seed(1)
x = abs(rnorm(100,1))
y = abs(rnorm(100,1))
threshFC = 0.5

seqVec <- seq_along(x)
fract = x[seqVec]/y[seqVec]
fract[1] = NaN
vapply(fract, function(i){
    if (!is.nan(fract)){ if(fract > (threshFC + 1) || fract < (1/(threshFC+1))){ i}}
}, character(1))

However, this attempt causes an error:

Error in vapply(fract, function(i) { : values must be length 1,
 but FUN(X[[1]]) result is length 0

How can I modify the code further to put it in an apply format? Note: the fract variable sometimes contains NaN values, which I mimicked in the minimal examples above by using "fract[1] = NaN".

Upvotes: 2

Views: 58

Answers (1)

r2evans

Reputation: 160677

There are several problems with your code:

  1. You tell vapply that you expect the internal code to return a character, yet the only thing you ever return is i which is numeric;
  2. You only return a value when all conditions are met; otherwise the function returns nothing, which is the same as return(NULL) and is also not character (try vapply(1:2, function(a) return(NULL), character(1)));
  3. In the for loop you set fract[1] = NaN and then test !is.nan(fract). Because fract holds a single value there, every iteration overwrites it with NaN, so you will never get anything; and
  4. (Likely a typo) You reference y[I] (capital "i"), which fails with an "object 'I' not found" error unless I happens to be defined somewhere, in which case the code runs but uses the wrong index (a logic error instead of a runtime error).
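
To make points 1 and 2 concrete, here is a minimal sketch of how vapply enforces its FUN.VALUE template. The helper fails() is a hypothetical name introduced only for this illustration; it is not part of the original code.

```r
# Hypothetical helper (not from the original post): evaluate an expression
# and report TRUE if it raised an error, FALSE otherwise.
fails <- function(expr) {
  tryCatch({ force(expr); FALSE }, error = function(e) TRUE)
}

# FUN returns NULL (length 0) whenever the condition is FALSE, so vapply stops.
returns_null <- fails(vapply(1:3, function(i) if (i > 2) i, integer(1)))

# FUN returns an integer although character(1) was promised, so vapply stops.
wrong_type <- fails(vapply(1:3, function(i) i, character(1)))

# FUN always returns exactly one integer, which matches integer(1).
ok <- vapply(1:3, function(i) if (i > 2) i else NA_integer_, integer(1))
```

Both failing calls stop on the first element, which is why the error in the question mentions FUN(X[[1]]).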

If I fix the code (remove NaN assignment) in your for loop, I get

indexPoints
#  [1]  3  4  5  6 10 11 12 13 14 15 16 18 20 21 25 26 28 29 30 31 32 34 35 38 39
# [26] 40 42 43 44 45 47 48 49 50 52 53 54 55 56 57 58 59 60 61 64 66 68 70 71 72
# [51] 74 75 77 78 79 80 81 82 83 86 88 89 90 91 92 93 95 96 97 98 99

If we really want to do this one at a time (I recommend against it, read below), then there are a few methods:

  1. Use Filter to only return the indices where the condition is true:

    indexPoints2 <- Filter(function(i) {
      fract <- x[i] / y[i]
      !is.nan(fract) && (fract > (threshFC+1) | fract < (1/(threshFC+1)))
    }, seq_along(seqVec))
    identical(indexPoints, indexPoints2)
    # [1] TRUE
    
  2. Use vapply correctly, returning an integer either way:

    indexPoints3 <- vapply(seq_along(seqVec), function(i) {
      fract <- x[i] / y[i]
      if (!is.nan(fract) && (fract > (threshFC+1) | fract < (1/(threshFC+1)))) i else NA_integer_
    }, integer(1))
    str(indexPoints3)
    #  int [1:100] NA NA 3 4 5 6 NA NA NA 10 ...
    indexPoints3 <- indexPoints3[!is.na(indexPoints3)]
    identical(indexPoints, indexPoints3)
    # [1] TRUE
    

    (Notice the explicit return of a typed NA, namely NA_integer_, so that every result matches the integer(1) template given to vapply.)

  3. We can instead just return the logical if the index matches the conditions:

    logicalPoints4 <- vapply(seq_along(seqVec), function(i) {
      fract <- x[i] / y[i]
      !is.nan(fract) && (fract > (threshFC+1) | fract < (1/(threshFC+1)))
    }, logical(1))
    head(logicalPoints4)
    # [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE
    identical(indexPoints, which(logicalPoints4))
    # [1] TRUE
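
As a side note to the typed NA used in method 2, the sketch below shows that R has one NA constant per atomic type and that a plain NA is logical. The character(1) case is an illustration chosen here, not something from the question; it is one situation where the untyped NA is rejected.

```r
# A plain NA is logical; the typed constants carry their own type.
na_types <- c(
  plain     = typeof(NA),
  integer   = typeof(NA_integer_),
  double    = typeof(NA_real_),
  character = typeof(NA_character_)
)

# With a character(1) template, a logical NA does not match and vapply stops.
plain_na_rejected <- tryCatch(
  { vapply(1:2, function(i) NA, character(1)); FALSE },
  error = function(e) TRUE
)

# The typed constant matches the template exactly.
typed_ok <- vapply(1:2, function(i) NA_character_, character(1))
```

Returning the typed constant that matches FUN.VALUE keeps the intent explicit and avoids depending on any type conversion.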
    

But really, there is absolutely no need to use vapply or any of the apply functions, since this can be easily (and much more efficiently) checked as a vector:

fract <- x/y # all at once
indexPoints5 <- which(!is.nan(fract) & (fract > (threshFC+1) | fract < (1/(threshFC+1))))
identical(indexPoints, indexPoints5)
# [1] TRUE

(If you don't use which, you'll see that it gives you a logical vector indicating if the conditions are met, similar to bullet 3 above with logicalPoints4.)
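
A small hand-checkable sketch of the vectorized form, using toy values that are assumed for illustration (they are not the question's data). The first pair is 0/0, which produces a NaN, so the effect of the !is.nan() guard is visible.

```r
# Toy data, assumed for illustration only.
x <- c(0, 3, 1, 1, 2)
y <- c(0, 1, 1, 4, 2)
threshFC <- 0.5

fract <- x / y   # NaN, 3, 1, 0.25, 1

# Comparing NaN yields NA; the !is.nan() guard turns that entry into FALSE.
keep <- !is.nan(fract) & (fract > (threshFC + 1) | fract < (1 / (threshFC + 1)))

keep          # logical vector, one entry per element
which(keep)   # positions of the TRUE entries
```

Here only the second (3 > 1.5) and fourth (0.25 < 1/1.5) ratios meet the condition, so which(keep) gives positions 2 and 4.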

Upvotes: 6
