ITs Me

Reputation: 33

Applying an apply function over a list

I have a list with 535 elements, each of which is a 1575x1575 matrix. However, some of the rows and columns consist entirely of NAs. I want to remove these rows and columns, and I have already written a line that works when I apply it to a single element. But I can't figure out how to apply this function to the whole list. covmatrix is my list in this example.

testf <- function(i){
  covmatrix[[i]][apply(!is.na(covmatrix[[i]]),2,any),apply(!is.na(covmatrix[[i]]),2,any)]
}

newlist <- lapply(covmatrix, testf)

I get the error: Error in covmatrix[[i]] : no such Index at Level 1. I guess I do not properly understand how lapply works.

Upvotes: 0

Views: 2857

Answers (2)

ekoam

Reputation: 8844

Assume that your list of matrices looks like this:

set.seed(100)
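# five 6x6 matrices in which roughly 70% of the entries are NA
ls_of_mat <- replicate(
  5,
  matrix(sample(c(NA, 1:10), size = 36, replace = TRUE, prob = c(.7, rep(.3 / 10, 10))), 6),
  simplify = FALSE
)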
[[1]]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   NA    5   NA   NA   NA   NA
[2,]   NA   NA   NA   NA   NA    9
[3,]   NA   NA    4   NA    4   NA
[4,]   NA   NA   NA    2    8   10
[5,]   NA   NA   NA   NA   NA   NA
[6,]   NA    8   NA    7   NA    8

[[2]]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   NA    4   NA   NA   NA   NA
[2,]   NA    6   NA   NA   NA   NA
[3,]    1   NA   NA   NA   10   NA
[4,]   NA   NA   NA   NA   NA   NA
[5,]   NA    4   NA   NA   NA   NA
[6,]    3    8   NA   NA   NA   NA

[[3]]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   NA    6   NA    8   NA   NA
[2,]   10   NA   NA   NA   NA   NA
[3,]   NA   NA    7   NA   NA   NA
[4,]   NA   NA   NA   NA    4   NA
[5,]    3    9   NA    8   NA    1
[6,]    4    1    7   NA   NA    2

Your logic simplifies to

# 1. find non-NA elements
# 2. drop rows and cols that contain no non-NA element

lapply(ls_of_mat, function(x) {
  is_value <- !is.na(x)
  # keep a row / column only if it has at least one non-NA entry
  x[rowSums(is_value) > 0L, colSums(is_value) > 0L]
})

Output

[[1]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    5   NA   NA   NA   NA
[2,]   NA   NA   NA   NA    9
[3,]   NA    4   NA    4   NA
[4,]   NA   NA    2    8   10
[5,]    8   NA    7   NA    8

[[2]]
     [,1] [,2] [,3]
[1,]   NA    4   NA
[2,]   NA    6   NA
[3,]    1   NA   10
[4,]   NA    4   NA
[5,]    3    8   NA

[[3]]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   NA    6   NA    8   NA   NA
[2,]   10   NA   NA   NA   NA   NA
[3,]   NA   NA    7   NA   NA   NA
[4,]   NA   NA   NA   NA    4   NA
[5,]    3    9   NA    8   NA    1
[6,]    4    1    7   NA   NA    2
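
As for the error in the question: lapply() passes each element of the list to the function, not its position. So in lapply(covmatrix, testf) the argument i is a whole matrix, and covmatrix[[i]] then tries to index the list with that matrix, which fails. A minimal sketch of two ways around this follows. The small covmatrix here is a made-up stand-in for the real 535-element list, and the row check uses margin 1, which is an assumption about what the original line intended.

```r
# Hypothetical stand-in for the asker's covmatrix: two small matrices
covmatrix <- list(
  matrix(c(1, NA, 3, NA, NA, NA, 7, NA, 9), nrow = 3),
  matrix(c(NA, NA, NA, NA, 5, 6, NA, 8, 9), nrow = 3)
)

# Option 1: keep the index-based function, but iterate over positions
testf <- function(i) {
  m <- covmatrix[[i]]
  keep_rows <- apply(!is.na(m), 1, any)
  keep_cols <- apply(!is.na(m), 2, any)
  m[keep_rows, keep_cols, drop = FALSE]
}
by_index <- lapply(seq_along(covmatrix), testf)

# Option 2: let lapply hand each matrix to the function directly
by_element <- lapply(covmatrix, function(m) {
  has_value <- !is.na(m)
  m[rowSums(has_value) > 0L, colSums(has_value) > 0L, drop = FALSE]
})

# Both routes should give the same cleaned list
identical(by_index, by_element)
```

Option 2 is usually the more natural fit for lapply, since the function no longer depends on a global variable.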

Upvotes: 1

Allan Cameron

Reputation: 173793

Let's take the following toy example data:

matlist <- lapply(1:3, function(x) matrix(1:9, ncol = 3))
matlist[[2]][1,] <- NA
matlist[[3]][,1] <- NA
matlist
#> [[1]]
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
#> 
#> [[2]]
#>      [,1] [,2] [,3]
#> [1,]   NA   NA   NA
#> [2,]    2    5    8
#> [3,]    3    6    9
#> 
#> [[3]]
#>      [,1] [,2] [,3]
#> [1,]   NA    4    7
#> [2,]   NA    5    8
#> [3,]   NA    6    9

It makes coding a lot easier if we break down the problem into little chunks. For a complex problem, clarity of code is more important than brevity.

First we need a function that will return FALSE if all elements of a vector are NA, and TRUE otherwise:

notallNA  <- function(vector) !all(is.na(vector))
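
A quick check of this helper on a few hand-made vectors (the inputs are purely illustrative) shows the behaviour described above:

```r
notallNA <- function(vector) !all(is.na(vector))

notallNA(c(NA, NA, NA))  # FALSE: every element is NA
notallNA(c(NA, 2, NA))   # TRUE: at least one real value
notallNA(c(1, 2, 3))     # TRUE: no NAs at all
```

Note that all() of an empty logical vector is TRUE, so an empty vector is treated as "all NA" and the helper returns FALSE for it.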

Now we write a second function that uses our first function to remove rows and columns that consist purely of NAs from a matrix:

remove_NA <- function(mat) {
  
  valid_rows <- apply(mat, 1, notallNA)
  valid_cols <- apply(mat, 2, notallNA)
  
  return(mat[valid_rows, valid_cols])
}

Finally, we can lapply this function to our list of matrices:

lapply(matlist, remove_NA)
#> [[1]]
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
#> 
#> [[2]]
#>      [,1] [,2] [,3]
#> [1,]    2    5    8
#> [2,]    3    6    9
#> 
#> [[3]]
#>      [,1] [,2]
#> [1,]    4    7
#> [2,]    5    8
#> [3,]    6    9
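
One edge case may be worth guarding against: when only a single row or column survives, subsetting with `[` simplifies the result to a plain vector unless drop = FALSE is supplied. The sketch below illustrates this. The name remove_NA_keep_dims and the matrix one_row are hypothetical and only for illustration.

```r
notallNA <- function(vector) !all(is.na(vector))

# Hypothetical variant of remove_NA that always returns a matrix
remove_NA_keep_dims <- function(mat) {
  valid_rows <- apply(mat, 1, notallNA)
  valid_cols <- apply(mat, 2, notallNA)
  mat[valid_rows, valid_cols, drop = FALSE]
}

# Toy matrix in which only the second row contains values
one_row <- matrix(c(NA, 2, NA, NA, 5, NA), nrow = 3)

dropped <- one_row[apply(one_row, 1, notallNA), apply(one_row, 2, notallNA)]
kept    <- remove_NA_keep_dims(one_row)

is.matrix(dropped)  # FALSE: collapsed to the vector c(2, 5)
dim(kept)           # 1 2: still a matrix
```

Whether this matters depends on what happens to the matrices afterwards. If later code expects every list element to be a matrix, keeping the dimensions is the safer choice.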

Note that, although we could squash these two functions into one or two lines of code, and do the whole thing as a lambda inside an lapply, the above code is simpler and easier to read / maintain than:

lapply(matlist, function(x) x[apply(x, 1, function(y) !all(is.na(y))), 
                              apply(x, 2, function(y) !all(is.na(y)))])
#> [[1]]
#>      [,1] [,2] [,3]
#> [1,]    1    4    7
#> [2,]    2    5    8
#> [3,]    3    6    9
#> 
#> [[2]]
#>      [,1] [,2] [,3]
#> [1,]    2    5    8
#> [2,]    3    6    9
#> 
#> [[3]]
#>      [,1] [,2]
#> [1,]    4    7
#> [2,]    5    8
#> [3,]    6    9

Upvotes: 2
