D M

Reputation: 81

Extracting data from lists of different levels using a function

Good morning all.

I have a function that takes a list as a parameter, extracts one element into a data frame, and computes various metrics from it. However, the lists I want to pass in are nested to different depths, so the function does not work on all of them.

Here is a simplified example (not the real function):

# the function pulls out a df from a list
df_func <- function(list){
  
  df_temp <- list[[1]]
  
  return(df_temp)  
  
}

data("iris") 

list_a <- list("list_a" = iris, "list_b" = iris,
                     "list_c" = iris, "list_d" = iris)

list_b <- list()
list_b[["list_a"]] <- list_a
list_b[["list_b"]] <- list_a
list_b[["list_c"]] <- list_a
list_b[["list_d"]] <- list_a

df1 <- df_func(list_a) # correct (returns a df)
df2 <- df_func(list_b) # wrong (returns a list of dfs)

I know the problem is that the correct element of list_a is accessed with list_a[[1]], whereas the correct element of list_b requires list_b[[1]][[1]].

My question is: how can I code the function so that R knows where to look for the df I need?

Thank you all for helping a newbie.

Upvotes: 2

Views: 523

Answers (3)

akrun

Reputation: 887691

Consider using an existing recursive function such as rapply/rrapply:

df_func <- function(listObj) {
  rrapply::rrapply(listObj, classes = "data.frame", how = "flatten")[[1]]
}

Testing:

> out1 <- df_func(list_a)
> out2 <- df_func(list_b)
> str(out1)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
> str(out2)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
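
If adding a dependency on rrapply is not an option, the same idea can be sketched in base R. The helper names below (`flatten_dfs`, `first_df`) are made up for illustration and are not part of any package, and the sketch assumes the leaves of interest are data frames.

```r
# Hypothetical helper: walk a nested list depth-first and collect every data frame.
flatten_dfs <- function(x) {
  # A data frame is a leaf here; do not descend into its columns.
  if (is.data.frame(x)) return(list(x))
  # For a plain list, flatten each element and concatenate the results.
  if (is.list(x)) return(do.call(c, lapply(unname(x), flatten_dfs)))
  # Anything else (vectors, NULL, ...) is ignored.
  list()
}

# Hypothetical wrapper: return the first data frame found at any depth.
first_df <- function(x) flatten_dfs(x)[[1]]

# Small versions of the lists from the question.
list_a <- list(list_a = iris, list_b = iris)
list_b <- list(list_a = list_a, list_b = list_a)

identical(first_df(list_a), first_df(list_b))
```

Unlike indexing with a fixed `[[1]]`, this does not need to know the nesting depth in advance, but it will fail with a subscript error if the input contains no data frame at all.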

Upvotes: 2

LC-datascientist

Reputation: 2096

If you have a list of lists (a nested list), you can add a second parameter (and a third, fourth, etc.) to the function to index into the nested list.

df_func <- function(list, list2, index = 1){
    df_temp <- list[[list2]][[index]] # default index = 1
    return(df_temp)  
}

df_func(list_b, 1) # outputs `list_b[[1]][[1]]`
df_func(list_b, 2) # outputs `list_b[[2]][[1]]`
df_func(list_b, 2, 2) # outputs `list_b[[2]][[2]]`

It will not work with list_a because that is not a nested list (it does not have the same depth of nesting). You will also need more parameters for more deeply nested lists.

Here is a more general way that is not restricted to a specific depth of nesting:

df_func2 <- function(list, index = 1) {
    l <- paste0("[[",index,"]]", collapse="")
    l2 <- paste0(deparse(substitute(list)),l)
    df_temp <- eval(parse(text=l2))
    return(df_temp)
}

df_func2(list_a) # outputs `list_a[[1]]`
df_func2(list_a, 2) # outputs `list_a[[2]]`
df_func2(list_b, 1) # outputs `list_b[[1]]` (list of data frames)
df_func2(list_b, c(1, 1)) # outputs `list_b[[1]][[1]]`
df_func2(list_b, c(1, 2)) # outputs `list_b[[1]][[2]]`
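
Base R's `[[` already does this for lists: given a vector index it is applied recursively, so `x[[c(1, 2)]]` is equivalent to `x[[1]][[2]]`. Below is a minimal sketch that relies on this and avoids `eval(parse())`; `df_func3` and `lst` are names chosen here for illustration.

```r
# Hypothetical variant: let `[[` do the recursive indexing.
df_func3 <- function(lst, index = 1) {
  lst[[index]]
}

# Small versions of the lists from the question.
list_a <- list(list_a = iris, list_b = iris)
list_b <- list(list_a = list_a, list_b = list_a)

df_func3(list_a)                          # same as list_a[[1]]
df_func3(list_b, c(1, 2))                 # same as list_b[[1]][[2]]
df_func3(list_b, c("list_b", "list_a"))   # names work too: list_b[["list_b"]][["list_a"]]
```

Because the list itself is indexed rather than rebuilt from its name as text, this should also behave the same when the argument is a local variable inside another function.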

Upvotes: 1

Ronak Shah

Reputation: 389175

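How about using a recursive function?

df_func <- function(list){
  tmp <- list[[1]]
  # Recurse while the first element is a plain list; stop at a data frame.
  # is.list() is also TRUE for data frames, so they are excluded explicitly.
  if (is.list(tmp) && !is.data.frame(tmp)) {
    df_func(tmp)
  } else tmp
}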
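
The same descent can also be written as a loop, which makes it easy to fail with a clear message on an empty list. This is only a sketch: `first_leaf` is a hypothetical name, and it assumes the data frame of interest is always reached by following the first element at each level.

```r
# Hypothetical iterative variant: keep taking the first element
# until the value is no longer a plain list.
first_leaf <- function(x) {
  while (is.list(x) && !is.data.frame(x)) {
    if (length(x) == 0) {
      stop("reached an empty list before finding a data frame")
    }
    x <- x[[1]]
  }
  x
}

# Small versions of the lists from the question.
list_a <- list(list_a = iris, list_b = iris)
list_b <- list(list_a = list_a, list_b = list_a)

df1 <- first_leaf(list_a)
df2 <- first_leaf(list_b)
identical(df1, df2)
```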

Upvotes: 0
