Remove a row from all dataframes in a list if NA value in one of the rows

Question

I have a list of data.frames of equal size. Each data.frame has missing data in different rows and columns. If a row contains a NaN in any one of the data.frames, I would like to remove that row from every data.frame in the list. My current code, which uses lapply and na.omit, only removes rows from the data.frame that actually contains the NaN; that makes sense, since it processes each data.frame in the list independently. Instead, I want a NaN in a given row of one data.frame to cause that row to be removed from all the other data.frames as well.

Some example code:

#Make list
ls <- list(x1=data.frame(a=c(1,2,3,4),b=c(2,3,4,5),c=c(3,4,NaN,6)),
           x2=data.frame(a=c(1,NaN,3,4),b=c(2,3,4,5),c=c(3,4,5,6)))
#Desired output
lscalc <- list(x1=data.frame(a=c(1,4),b=c(2,5),c=c(3,6)),
               x2=data.frame(a=c(1,4),b=c(2,5),c=c(3,6)))
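
For reference, the per-data-frame behaviour described above can be reproduced with a sketch like the one below. The original attempt is not shown in the question, so the `lapply(..., na.omit)` call is an assumption about what it looked like, and the list is named `dfs` here (a hypothetical name) so the snippet does not mask `base::ls`.

```r
# Same data as the example list, under a hypothetical name
dfs <- list(
  x1 = data.frame(a = c(1, 2, 3, 4), b = c(2, 3, 4, 5), c = c(3, 4, NaN, 6)),
  x2 = data.frame(a = c(1, NaN, 3, 4), b = c(2, 3, 4, 5), c = c(3, 4, 5, 6))
)

# Assumed form of the attempt: na.omit is applied to each data frame
# independently, so each one only loses its own incomplete rows
per_df <- lapply(dfs, na.omit)

# x1 keeps rows 1, 2, 4 while x2 keeps rows 1, 3, 4,
# so the data frames no longer line up row for row
lapply(per_df, rownames)
```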

akrun · Accepted Answer

Assuming all the data frames have the same number of rows, first collect the indices of the rows that contain an NA/NaN in any of them, then loop over the list and remove those rows from every data frame:

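# Row indices that contain an NA/NaN in any data frame of the list
un1 <- unique(unlist(lapply(ls, function(x) which(is.na(x), arr.ind = TRUE)[,1])))
# Drop those rows from every data frame
lapply(ls, function(x) x[!seq_len(nrow(x)) %in% un1, ])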
$x1
  a b c
1 1 2 3
4 4 5 6

$x2
  a b c
1 1 2 3
4 4 5 6
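
The same idea can also be sketched with a logical mask instead of row indices. This is an alternative, not the method used above: it relies on base R's `complete.cases()`, which flags rows without NA/NaN, and combines the per-data-frame masks with `Reduce()`. The names `dfs`, `keep`, and `result` are hypothetical, and the sketch still assumes every data frame has the same number of rows.

```r
# Same data as the example list, under a hypothetical name
dfs <- list(
  x1 = data.frame(a = c(1, 2, 3, 4), b = c(2, 3, 4, 5), c = c(3, 4, NaN, 6)),
  x2 = data.frame(a = c(1, NaN, 3, 4), b = c(2, 3, 4, 5), c = c(3, 4, 5, 6))
)

# One logical vector per data frame: TRUE where the row has no NA/NaN.
# Combining them with `&` keeps a row only if it is complete everywhere.
keep <- Reduce(`&`, lapply(dfs, complete.cases))

# Apply the shared mask to every data frame; drop = FALSE keeps the
# result a data frame even if only one column is present
result <- lapply(dfs, function(x) x[keep, , drop = FALSE])
result
```

With the example data, `keep` is TRUE only for rows 1 and 4, so `result` should match the desired output apart from the retained row names.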
