pachadotdev

Reputation: 3765

Remove dataframes from list of dataframes using loop

I want to reduce a list of data frames to only those elements that have a certain number of columns.

This is a dummy example of what I'm trying to do:

    #1: define the list
    tables = list(mtcars,iris)

    for(k in 1:length(tables)) {
      # 2: be sure that each element is shaped as dataframe and not matrix
      tables[[k]] = as.data.frame(tables[[k]])
      # 3: remove elements that have more or less than 5 columns
      if(ncol(tables[[k]]) != 5) {
        tables <- tables[-k]
      }
    }

Another option I tried:

    #1: define the list
    tables = list(mtcars,iris)

    for(k in 1:length(tables)) {
      # 2: be sure that each element is shaped as dataframe
      tables[[k]] = as.data.frame(tables[[k]])
      # 3: remove elements that have more or less than 5 columns
      if(ncol(tables[[k]]) != 5) {
        tables[[-k]] <- NULL
      }
    }

I'm getting

Error in tables[[k]] : subscript out of bounds.

Is there an alternative and correct approach?

Upvotes: 3

Views: 1772

Answers (2)

austensen

Reputation: 3007

For a tidyverse option, you can use purrr::keep. You define a predicate function: elements for which it returns TRUE are kept, and elements for which it returns FALSE are removed. Here I've written the predicate with the formula shorthand.


    library(purrr)

    tables <- list(mtcars, iris)

    result <- purrr::keep(tables, ~ ncol(.x) == 5)

    str(result)

    #> List of 1
    #>  $ :'data.frame':    150 obs. of  5 variables:
    #>   ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
    #>   ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
    #>   ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
    #>   ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
    #>   ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
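
If a loop is still preferred, the same predicate idea carries over. The loop in the question fails because `1:length(tables)` is evaluated once, before the loop starts, while the list shrinks inside it, so a later `k` points past the end of the list. A minimal base R sketch that avoids this records the decision for each element and subsets only once at the end (`keep_idx` is just an illustrative name, not something from the question):

```r
tables <- list(mtcars, iris)

# One TRUE/FALSE slot per element; the list length never changes inside the loop
keep_idx <- logical(length(tables))

for (k in seq_along(tables)) {
  # coerce so a matrix is treated like a data frame (step 2 in the question)
  tables[[k]] <- as.data.frame(tables[[k]])
  # record whether this element has exactly 5 columns
  keep_idx[k] <- ncol(tables[[k]]) == 5
}

# Subset once, after the loop has finished
tables <- tables[keep_idx]
```

With `mtcars` (11 columns) and `iris` (5 columns), only `iris` should remain. `seq_along(tables)` is used instead of `1:length(tables)` so that an empty list gives zero iterations.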

Upvotes: 1

akrun

Reputation: 886968

We can use Filter:

    Filter(function(x) ncol(x) == 5, tables)

Or with sapply to create a logical index and subset the list:

    tables[sapply(tables, ncol) == 5]

Or, as @Sotos commented:

    tables[lengths(tables) == 5]

lengths returns the length of each list element. Comparing that to 5 gives a logical vector, which is then used to subset the list. This works because the length of a data.frame is the number of columns it has.
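
One caveat, since the question mentions elements that may be matrices: the length of a matrix is its number of cells, not its number of columns, so the `lengths` shortcut is only safe once everything is a data.frame. A small sketch, assuming a made-up list `mixed` that holds one data frame and one 5-column matrix:

```r
# Hypothetical input: a data frame and a 2 x 5 matrix
mixed <- list(iris, matrix(0, nrow = 2, ncol = 5))

# The matrix reports 10 (its cells), so it would be dropped by mistake
raw_lengths <- lengths(mixed)

# Coerce every element to a data frame first, then count columns
mixed_df <- lapply(mixed, as.data.frame)
result <- mixed_df[lengths(mixed_df) == 5]
```

Here both elements are kept after coercion. The `sapply(tables, ncol)` version does not have this issue for matrices, because `ncol` works on them directly.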

Upvotes: 3
