Delete empty dataframes from List

I have the following list of data frames, which I'm working with inside a for loop. It contains two empty data frames, x[[2]] and x[[4]]. How can I remove them inside my for loop? For example, something like:

for (i in 1:length(x)) {
  # x[[i]] <- (remove x[[i]] if it is an empty data frame)
}

DATASET:

    x <- list(structure(list(`8935364000175` = c(0.0060428512369981, 0.00603116477577714, 
0.00601948031544453, 0.00584680183237651, 0.00588356233492959, 
0.00604482211201685, 0.00595391284150537, 0.00612897711107507, 
0, 0.00608207737968769), `25079578000106` = c(0.0561319890039158, 
0.0713528263077023, -0.310352321776008, 0.244770088829682, 0.0361304175385158, 
-0.215327063233417, -0.0463209246845508, 0, 0.0781647175244871, 
0.0306871725115343)), row.names = c("Retorno D - 260", "Retorno D - 259", 
"Retorno D - 258", "Retorno D - 257", "Retorno D - 256", "Retorno D - 255", 
"Retorno D - 254", "Retorno D - 253", "Retorno D - 252", "Retorno D - 251"
), class = "data.frame"), structure(list(), row.names = c("Retorno D - 260", 
"Retorno D - 259", "Retorno D - 258", "Retorno D - 257", "Retorno D - 256", 
"Retorno D - 255", "Retorno D - 254", "Retorno D - 253", "Retorno D - 252", 
"Retorno D - 251"), class = "data.frame"), structure(list(`25079578000106` = c(-0.0284871479379945, 
0.141976900522423, -0.115634388475883, 0.0858369759953348, 0.252102295598888, 
-0.130994651044603, 0.213179273123387, 0, 0.254748840234242, 
-0.162688137697842), `19107923000175` = c(-2.542040795106, -1.30722252988562, 
0.166101507966232, -0.333577277251607, -0.48391700402135, -0.287302340893802, 
0.276978237343428, 0, -2.20114477424431, 2.28636453339277), `15674503000110` = c(0.151917711446004, 
0.27261553095741, -0.0217761778003478, 0.0357082184564206, 0, 
-0.430589756888367, 0.0980497330601793, 0.162832113528566, 0.135437075368827, 
0.0254803373536561), `19391009000107` = c(0.118515970461885, 
0.0201793494852609, 0.05212900123297, -0.122335026844667, 0, 
-0.173768502372695, -0.146583881632978, -0.102553665146843, -0.161486374236119, 
0.522601667762501), `26111809000184` = c(0.188357122169691, 0.597206398924754, 
-0.117262560343079, -0.350788641299005, 0, -0.427825340193522, 
0.0359309879058856, 0.144902896136045, 0.43725947070925, 0.0456868876426597
), `32666326000149` = c(1.84565666459093, 3.33521612974437, -0.706821796120494, 
-1.41998375802359, 0, -1.00702592444577, -0.764259576953918, 
0.504494091364904, 2.34908743768756, 1.12513038984616)), row.names = c("Retorno D - 260", 
"Retorno D - 259", "Retorno D - 258", "Retorno D - 257", "Retorno D - 256", 
"Retorno D - 255", "Retorno D - 254", "Retorno D - 253", "Retorno D - 252", 
"Retorno D - 251"), class = "data.frame"), structure(list(), row.names = c("Retorno D - 260", 
"Retorno D - 259", "Retorno D - 258", "Retorno D - 257", "Retorno D - 256", 
"Retorno D - 255", "Retorno D - 254", "Retorno D - 253", "Retorno D - 252", 
"Retorno D - 251"), class = "data.frame"))

Upvotes: 1

Views: 478

Answers (3)

GuedesBF

Reputation: 9858

I like to use purrr::keep and purrr::discard in such situations instead of for loops. The following code discards all list elements with either ncol == 0 or nrow == 0 ('empty' data.frames):

library(purrr)

new_x <- discard(x, ~ncol(.x)==0 || nrow(.x)==0)

# OR with `length(as.matrix)`:

new_x <- discard(x, ~length(as.matrix(.x))==0)

str(new_x)
List of 2
 $ :'data.frame':   10 obs. of  2 variables:
  ..$ 8935364000175 : num [1:10] 0.00604 0.00603 0.00602 0.00585 0.00588 ...
  ..$ 25079578000106: num [1:10] 0.0561 0.0714 -0.3104 0.2448 0.0361 ...
 $ :'data.frame':   10 obs. of  6 variables:
  ..$ 25079578000106: num [1:10] -0.0285 0.142 -0.1156 0.0858 0.2521 ...
  ..$ 19107923000175: num [1:10] -2.542 -1.307 0.166 -0.334 -0.484 ...
  ..$ 15674503000110: num [1:10] 0.1519 0.2726 -0.0218 0.0357 0 ...
  ..$ 19391009000107: num [1:10] 0.1185 0.0202 0.0521 -0.1223 0 ...
  ..$ 26111809000184: num [1:10] 0.188 0.597 -0.117 -0.351 0 ...
  ..$ 32666326000149: num [1:10] 1.846 3.335 -0.707 -1.42 0 ...
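The reason for checking both dimensions can be seen with a small base R sketch. The object `no_cols` below is a hypothetical stand-in built the same way as the empty data frames in the question: it keeps its 10 row names but has no columns, so a test on nrow alone would not flag it.

```r
# Hypothetical zero-column data frame, shaped like x[[2]] in the question
no_cols <- structure(list(),
                     row.names = paste("Retorno D -", 260:251),
                     class = "data.frame")

# Zero-row data frame, for comparison
no_rows <- data.frame(x = numeric(0))

dim(no_cols)  # rows come from the row names, columns from the list length
dim(no_rows)

# Only the combined test treats both as empty
is_empty <- function(df) nrow(df) == 0 || ncol(df) == 0
c(is_empty(no_cols), is_empty(no_rows))
```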

Upvotes: 3

Ben Bolker

Reputation: 226097

Simpler test data:

x <- list(data.frame(x=1:3), data.frame(x=numeric(0)), data.frame(x=1:4))

I would recommend (without a for loop):

is_empty <- function(x) (nrow(x)==0 || ncol(x) ==0)
x <- x[!sapply(x, is_empty)] 

sapply() creates a logical vector which is TRUE where a data frame is empty; negating it with ! and subsetting the original list keeps only the non-empty data frames.
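As a minimal sketch of this subsetting approach on the simpler test data (the names `empty`, `kept` and `kept2` are only illustrative, and vapply() and Filter() are base R alternatives swapped in here, not part of the code above):

```r
# Simpler test data: the second data frame has zero rows
x <- list(data.frame(x = 1:3), data.frame(x = numeric(0)), data.frame(x = 1:4))

is_empty <- function(df) nrow(df) == 0 || ncol(df) == 0

# vapply() works like sapply() but insists on one logical value per element
empty <- vapply(x, is_empty, logical(1))

# Keep only the elements that are NOT empty
kept <- x[!empty]

# Base R's Filter() with a negated predicate gives the same selection
kept2 <- Filter(Negate(is_empty), x)

length(kept)
```

Because the whole list is subset in one step, no index ever refers to a position that has already shifted.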

Setting an element of a list to NULL (x[[i]] <- NULL) removes it from the list, but I would worry that your indexing is going to get screwed up. It's very tricky to get this right, because the list shrinks under you while the loop indices 1:length(x) were fixed when the loop started. For example, consider

x <- list(data.frame(x=1:3), data.frame(x=numeric(0)),
          data.frame(x=numeric(0)))
for (i in 1:length(x)) {
   if (nrow(x[[i]])==0) x[[i]] <- NULL
}

This fails with "Error in x[[i]] : subscript out of bounds", because

  1. i==1, check x[[1]]: OK, move on to next element
  2. i==2; remove the second element. Now we have a list with only two elements (the previous first and third elements)
  3. i==3; error, because x[[3]] doesn't exist!
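The steps above can be reproduced with a short sketch that wraps the loop in tryCatch() so the script keeps running after the error (the variable `outcome` is only illustrative). It also shows that the list is left in a half-cleaned state: one empty data frame is still in it.

```r
x <- list(data.frame(x = 1:3), data.frame(x = numeric(0)),
          data.frame(x = numeric(0)))

# Run the naive loop and capture the error message instead of stopping
outcome <- tryCatch({
  for (i in 1:length(x)) {
    if (nrow(x[[i]]) == 0) x[[i]] <- NULL
  }
  "no error"
}, error = function(e) conditionMessage(e))

outcome         # the captured error message
length(x)       # the list was modified before the error occurred
nrow(x[[2]])    # the remaining second element is still an empty data frame
```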

You could do this to avoid messing up the indexing:

j <- 0  ## cumulative number removed
for (i in seq_along(x)) {
  if (is_empty(x[[i-j]])) {
     x[[i-j]] <- NULL
     j <- j + 1
  }
}

but this seems like a "code smell", i.e. you're having to work extra hard because you're doing it in an awkward way.
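If a loop is really required, another common workaround (a different technique from the counter above, shown here only as a sketch) is to walk the indices from last to first. Removing element i only shifts the elements after i, and those have already been visited.

```r
x <- list(data.frame(x = 1:3), data.frame(x = numeric(0)),
          data.frame(x = numeric(0)))

is_empty <- function(df) nrow(df) == 0 || ncol(df) == 0

# Iterate backwards so that deletions never disturb indices still to come
for (i in rev(seq_along(x))) {
  if (is_empty(x[[i]])) x[[i]] <- NULL
}

length(x)
```

This avoids the extra counter, although subsetting the list in one step is still shorter.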

Upvotes: 2

Acturio

Reputation: 74

for (i in 1:length(x)) {
  if ( ncol(x[[i]]) == 0) {
    x[[i]] <- NULL
  }
}
x

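There is no problem if you see "Error in x[[i]] : subscript out of bounds". With the dataset in the question, both empty data frames have already been removed by the time the error occurs, so the problem is solved. Note that this relies on the empty data frames not being adjacent: if two of them sit next to each other, the second one is skipped.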

Upvotes: 0
