Indexing through a function to gather multiple values

Question

I want to extract a sequence of values from this data frame and combine them into a single output. I can extract each piece individually using:

p <- function(x, i){
  # keep columns x, then keep the rows where column i falls in 2000:2020
  rank_data[x][rank_data[, i] %in% 2000:2020, ]
}

p(1:2, 2)
#output
jan    year
235.2  2008

where the arguments to p() continue through these sequences:

  x=c(1:2, 3:4, 5:6)
  i=c(2, 4, 6)

I'm looking for a single piece of code in which x and i are indexed into the data frame to produce the expected output. Other iteration functions such as apply are also welcome. I want to better understand indexing through iterative functions.

expected output:

jan    year  feb    year2  mar    year3  ...
235.2  2008  287.6  2020   187.8  2019   ...
NA     NA    241.9  2002   NA     NA

I've asked a similar question here, but I'm more interested in doing this through indexing with a single iterative function. The technique provided in the answer to the previous question is very specialised, so I'm looking for something simpler to get the hang of this.

Reproducible code:

rank_data <- structure(list(jan = c(268.1, 263.1, 235.2, 223.3, 219.2, 218.3
), year = c(1928, 1948, 2008, 1877, 1995, 1990), feb = c(287.6, 
241.9, 213.7, 205.1, 191.9, 191.2), year2 = c(2020, 2002, 1997, 
1990, 1958, 1923), mar = c(225.3, 190.7, 187.8, 187.2, 175.9, 
173.9), year3 = c(1981, 1903, 2019, 1947, 1994, 1912)), class = "data.frame", row.names = c(NA, 
6L))

Ronak Shah · Accepted Answer

You should store the values of x in a list. If you store them in a vector, c() flattens them into one sequence and there is no way to tell the groups apart.

x = c(1:2, 3:4, 5:6)
x
#[1] 1 2 3 4 5 6

Storing them in a list keeps the groups separate:

x= list(1:2, 3:4, 5:6)
x
#[[1]]
#[1] 1 2

#[[2]]
#[1] 3 4

#[[3]]
#[1] 5 6
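
If the groups follow a regular pattern, the list does not have to be typed by hand. This is a minimal sketch, assuming the columns come in adjacent value/year pairs as in the question's data. The name n_cols is introduced here for illustration only.

```r
n_cols <- 6  # assumed: ncol(rank_data) for the data in the question

# split() cuts 1:n_cols into groups using the labels 1, 1, 2, 2, 3, 3
x <- unname(split(seq_len(n_cols), rep(seq_len(n_cols / 2), each = 2)))

# the year column is taken to be the second column of each pair
i <- seq(2, n_cols, by = 2)

str(x)
i
```

This yields the same groups as list(1:2, 3:4, 5:6) and the same year positions as c(2, 4, 6).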

You can use Map to call p once for each pair of elements from x and i, subsetting your data frame each time.

p <- function(x, i){
  r<- rank_data[x][rank_data[,i] %in% 2000:2020,]
  r
}


x= list(1:2, 3:4, 5:6)
i= c(2, 4, 6)
result <- Map(p, x, i)
result

#[[1]]
#    jan year
#3 235.2 2008

#[[2]]
#    feb year2
#1 287.6  2020
#2 241.9  2002

#[[3]]
#    mar year3
#3 187.8  2019
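
To see what Map is doing, it may help to write the same iteration as an explicit loop. The sketch below rebuilds the question's data so it runs on its own. The names result_loop and result_map are introduced here for the comparison, and the loop is only an illustration of the idea, not a description of how Map is implemented internally.

```r
rank_data <- data.frame(
  jan   = c(268.1, 263.1, 235.2, 223.3, 219.2, 218.3),
  year  = c(1928, 1948, 2008, 1877, 1995, 1990),
  feb   = c(287.6, 241.9, 213.7, 205.1, 191.9, 191.2),
  year2 = c(2020, 2002, 1997, 1990, 1958, 1923),
  mar   = c(225.3, 190.7, 187.8, 187.2, 175.9, 173.9),
  year3 = c(1981, 1903, 2019, 1947, 1994, 1912)
)

# keep columns x, then keep the rows where column i falls in 2000:2020
p <- function(x, i) rank_data[x][rank_data[, i] %in% 2000:2020, ]

x <- list(1:2, 3:4, 5:6)
i <- c(2, 4, 6)

# loop version: the k-th call pairs the k-th element of x with the k-th of i
result_loop <- vector("list", length(x))
for (k in seq_along(x)) {
  result_loop[[k]] <- p(x[[k]], i[k])
}

# Map version: the same pairing, without managing k by hand
result_map <- Map(p, x, i)

all.equal(result_loop, result_map)
```

Note the double brackets in x[[k]]: they pull out the vector stored in the list, whereas x[k] would return a one-element list.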

If you want the output in the same shape as shown, add another step that finds the maximum number of rows and pads every piece to that length.

nr <- 1:max(sapply(result, nrow))
do.call(cbind, lapply(result, function(x) x[nr, ]))

#     jan year   feb year2   mar year3
#3  235.2 2008 287.6  2020 187.8  2019
#NA    NA   NA 241.9  2002    NA    NA
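
The padding works because indexing a data frame with a row number beyond nrow() returns a row of NA rather than an error. Here is a small sketch on a one-row data frame. The objects df and padded are made up for illustration and are not part of the question's data.

```r
df <- data.frame(mar = 187.8, year3 = 2019)

# row 2 does not exist, so it comes back filled with NA
padded <- df[1:2, ]
padded

# optional: replace the "NA" row label with plain 1, 2, ...
rownames(padded) <- NULL
padded
```

Once every piece has the same number of rows, cbind can place them side by side.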
