Alejandro Navarrete

Reputation: 3

R - Looping over indexes from data frame

I'm trying to loop over a group of indexes from a data frame.
The data frame only has 1 column.

X  
1  
2  
3  
... 

Consider the following variable vars, which holds pairs of start and end indexes into the data frame:

   $1  
   [1]  1 28

   $2  
   [1] 29 61

I want to loop over each of those index ranges and apply a function to the corresponding values in the data frame.
For example: loop over indexes 1 through 28 and apply one function, then loop over indexes 29 through 61 and apply a different function, and so on.
This is what I've tried:

z = list()
for (i in 1:length(vars)) {
  z[[i]] <- i
  for (j in vars[[i]][1]:vars[[i]][2]) {
    z[[i]][j] <- j
  }
}

Before applying any function to the data frame, I would first like to check that I'm getting the right indexes, but this is what I got:

[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28

[[2]]
 [1]  2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 29 30 31 32 33 34 35
[36] 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61

This is not what I was expecting. The first list element is fine, but I can't tell what's happening with the second one.

Upvotes: 0

Views: 4701

Answers (1)

r2evans

Reputation: 160407

This may be better handled with lapply:

df <- data.frame(x=1:100)
vars <- list(c(1,28), c(29,61))
str(lapply(vars, function(i) df$x[ i[1]:i[2] ]))
# List of 2
#  $ : int [1:28] 1 2 3 4 5 6 7 8 9 10 ...
#  $ : int [1:33] 29 30 31 32 33 34 35 36 37 38 ...

(The use of str was to shorten this display.)
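
As for why the loop in the question produced NA values: inside the inner loop, j is the absolute row index, so for the second range the first write goes to position 29 of z[[2]] and R pads positions 2 through 28 with NA (position 1 still holds the initial value i). A minimal sketch of that behaviour and one possible fix, assuming vars holds c(start, end) pairs as in the question (z_bad is just an illustrative name):

```r
# Assumed to match the question: each element is a c(start, end) pair.
vars <- list(c(1, 28), c(29, 61))

# The question's pattern: j is used as a position inside z_bad[[i]],
# so writing at position 29 of a length-1 vector pads 2..28 with NA.
z_bad <- list()
for (i in seq_along(vars)) {
  z_bad[[i]] <- i
  for (j in vars[[i]][1]:vars[[i]][2]) {
    z_bad[[i]][j] <- j
  }
}

# One possible fix: assign the whole range at once, no inner loop.
z <- vector("list", length(vars))
for (i in seq_along(vars)) {
  z[[i]] <- vars[[i]][1]:vars[[i]][2]
}
str(z)
```

If an inner loop is really needed, indexing with the offset j - vars[[i]][1] + 1 instead of j would avoid the padding.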

If you want to apply some arbitrary function to the values within each range (1:28, for example), do that work inside the function passed to lapply. For example:

func <- function(ab, x) { mean(x[ ab[1]:ab[2] ]); }
str(lapply(vars, func, df$x))
# List of 2
#  $ : num 14.5
#  $ : num 45

Here, func is a contrived function that takes two arguments: a length-2 vector of index endpoints (e.g., c(1,28)) and the vector of values.

Notes about this example function:

  1. I put the ab argument (indices) first intentionally, to facilitate the shorter notation within lapply. Note that lapply(vars, func, df$x) is expanded into lapply(vars, function(a) func(a, df$x)), so I think it's a little more readable above. If the arguments within func were reversed, you could not use the abbreviated format, instead requiring lapply(vars, function(a) func(df$x, a)).

  2. There may be better ways to take the mean of that range; this is a trivial example to show how you could extend it.
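
Since the question mentions applying a different function to each range, one way to extend the idea is to pair each range with its own function using Map. This is only a sketch: funs is a hypothetical list I made up for illustration (mean for the first range, sum for the second), and df and vars are the same sample data as above.

```r
# Same sample data as in the answer above.
df <- data.frame(x = 1:100)
vars <- list(c(1, 28), c(29, 61))

# Hypothetical per-range functions, in the same order as vars.
funs <- list(mean, sum)

# Map walks vars and funs in parallel: ab is the c(start, end) pair,
# f is the function to apply to that slice of df$x.
res <- Map(function(ab, f) f(df$x[ab[1]:ab[2]]), vars, funs)
str(res)
```

Map returns a list with one result per range, so the functions do not need to return values of the same type or length.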

Upvotes: 1
