Vasile

Reputation: 1017

Select data frames from a list based on the name of a column

Say I have the following list of data frames:

d1 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6), y3=c(1,2,3))
d2 <- data.frame(y1 = c(3, 2, 1), z2 = c(6, 5, 4), y3=c(1,2,3))
d3 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6), y3=c(1,2,3))
d4 <- data.frame(y1 = c(3, 2, 1), z2 = c(6, 5, 4), y3=c(1,2,3))
d5 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6), y3=c(1,2,3))

my.list <- list(d1, d2, d3, d4, d5)

I would like to separate these data frames into two lists based on the name of their second column. That is, because the second column of d2 and d4 is named z2 while the second column of the others is named y2, d2 and d4 should go in one list and the remaining three in another.

Upvotes: 1

Views: 435

Answers (2)

GKi

Reputation: 39667

You can extract the second column of each data frame with [ in sapply by giving the column number (here 2), and then use the resulting names to split my.list according to the name of the second column.

l2 <- split(my.list, names(sapply(my.list, `[`, 2)))
#l2 <- split(my.list, sapply(my.list, function(x) names(x)[2])) #Alternative

str(l2)
#List of 2
# $ y2:List of 3
#  ..$ :'data.frame':    3 obs. of  3 variables:
#  .. ..$ y1: num [1:3] 1 2 3
#  .. ..$ y2: num [1:3] 4 5 6
#  .. ..$ y3: num [1:3] 1 2 3
#  ..$ :'data.frame':    3 obs. of  3 variables:
#  .. ..$ y1: num [1:3] 1 2 3
#  .. ..$ y2: num [1:3] 4 5 6
#  .. ..$ y3: num [1:3] 1 2 3
#  ..$ :'data.frame':    3 obs. of  3 variables:
#  .. ..$ y1: num [1:3] 1 2 3
#  .. ..$ y2: num [1:3] 4 5 6
#  .. ..$ y3: num [1:3] 1 2 3
# $ z2:List of 2
#  ..$ :'data.frame':    3 obs. of  3 variables:
#  .. ..$ y1: num [1:3] 3 2 1
#  .. ..$ z2: num [1:3] 6 5 4
#  .. ..$ y3: num [1:3] 1 2 3
#  ..$ :'data.frame':    3 obs. of  3 variables:
#  .. ..$ y1: num [1:3] 3 2 1
#  .. ..$ z2: num [1:3] 6 5 4
#  .. ..$ y3: num [1:3] 1 2 3
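
As a minimal sketch of the same idea, the grouping vector can be built explicitly before calling split. This variant uses vapply instead of sapply so that the result is guaranteed to be a character vector with one label per data frame; the name second_col is only an illustrative choice, and the list below reuses d1 and d2 to mirror the data in the question.

```r
# Data mirroring the question: d3 and d5 equal d1, d4 equals d2.
d1 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6), y3 = c(1, 2, 3))
d2 <- data.frame(y1 = c(3, 2, 1), z2 = c(6, 5, 4), y3 = c(1, 2, 3))
my.list <- list(d1, d2, d1, d2, d1)

# One label per data frame: the name of its second column.
# vapply errors if any element does not yield exactly one string.
second_col <- vapply(my.list, function(x) names(x)[2], character(1))

# split() groups the list elements by that label.
l2 <- split(my.list, second_col)

lengths(l2)
#> y2 z2 
#>  3  2
```

Keeping the labels in their own vector also makes it easy to inspect them, for example with table(second_col), before splitting.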

Upvotes: 2

AnilGoyal

Reputation: 26218

d1 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6), y3=c(1,2,3))
d2 <- data.frame(y1 = c(3, 2, 1), z2 = c(6, 5, 4), y3=c(1,2,3))
d3 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6), y3=c(1,2,3))
d4 <- data.frame(y1 = c(3, 2, 1), z2 = c(6, 5, 4), y3=c(1,2,3))
d5 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6), y3=c(1,2,3))

my.list <- list(d1, d2, d3, d4, d5)

split(my.list, sapply(my.list, \(x) names(x)[2]))
#> $y2
#> $y2[[1]]
#>   y1 y2 y3
#> 1  1  4  1
#> 2  2  5  2
#> 3  3  6  3
#> 
#> $y2[[2]]
#>   y1 y2 y3
#> 1  1  4  1
#> 2  2  5  2
#> 3  3  6  3
#> 
#> $y2[[3]]
#>   y1 y2 y3
#> 1  1  4  1
#> 2  2  5  2
#> 3  3  6  3
#> 
#> 
#> $z2
#> $z2[[1]]
#>   y1 z2 y3
#> 1  3  6  1
#> 2  2  5  2
#> 3  1  4  3
#> 
#> $z2[[2]]
#>   y1 z2 y3
#> 1  3  6  1
#> 2  2  5  2
#> 3  1  4  3

Adding list2env at the end saves the two lists in the global environment as objects named y2 and z2, as desired:

split(my.list, sapply(my.list, \(x) names(x)[2])) |>
  list2env(envir = .GlobalEnv)
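
Note that writing into .GlobalEnv silently overwrites any existing objects called y2 or z2. If that is a concern, one possible alternative (a sketch, not part of the original answer) is to keep the split result as a named list, or to copy it into a fresh environment; the names groups and env below are illustrative.

```r
# Data mirroring the question: d3 and d5 equal d1, d4 equals d2.
d1 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6), y3 = c(1, 2, 3))
d2 <- data.frame(y1 = c(3, 2, 1), z2 = c(6, 5, 4), y3 = c(1, 2, 3))
my.list <- list(d1, d2, d1, d2, d1)

# Keep the result as a named list and access the groups by name.
groups <- split(my.list, sapply(my.list, function(x) names(x)[2]))
length(groups$y2)
#> [1] 3

# Or copy the groups into a new environment instead of .GlobalEnv.
env <- list2env(groups, envir = new.env())
ls(env)
#> [1] "y2" "z2"
```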

Upvotes: 2
