RobertF

Reputation: 904

How to apply a function to every element of a matrix?

I'm struggling to apply a function to each element of a matrix, a lower triangular Jaccard similarity matrix.

The function should keep the matrix values that are > 0.7 and set every other element to NA, making it easier to identify highly similar binary variables. Ideally the matrix structure is preserved.

I've created a simple sample 3x3 matrix populated with random values for testing:

N <- 3 # Observations
N_vec <- 3 # Number of vectors
set.seed(123)
x1 <- runif(N * N_vec)
mat_x1 <- matrix(x1, ncol = N_vec)
mat_x1[upper.tri(mat_x1)] <- NA
diag(mat_x1) <- NA
mat_x1
              [,1]      [,2] [,3]
    [1,]        NA        NA   NA
    [2,] 0.7883051        NA   NA
    [3,] 0.4089769 0.0455565   NA

How do I apply the following function, which returns only values > 0.7, to each matrix element?

y = (function(x) if (x > .7) { return(x) } else { return(NA) })

I'd like to see the following after applying the function:

mat_x2
              [,1] [,2] [,3]
    [1,]        NA   NA   NA
    [2,] 0.7883051   NA   NA
    [3,]        NA   NA   NA

Upvotes: 5

Views: 163

Answers (7)

ThomasIsCoding

Reputation: 102299

You can combine NA with ^ to build a mask:

> mat_x1 * NA^(mat_x1 <= 0.7)
          [,1] [,2] [,3]
[1,]        NA   NA   NA
[2,] 0.7883051   NA   NA
[3,]        NA   NA   NA

Or, more simply, use replace:

> replace(mat_x1, mat_x1 <= .7, NA)
          [,1] [,2] [,3]
[1,]        NA   NA   NA
[2,] 0.7883051   NA   NA
[3,]        NA   NA   NA
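
To see why the mask works, here is a minimal sketch. It hard-codes the question's values in a matrix called m (a name chosen here only so the block runs on its own) and relies on the fact that in R anything raised to the power 0 is 1, including NA.

```r
# Same values as the question's mat_x1, hard-coded for a self-contained example
m <- matrix(c(NA, 0.7883051, 0.4089769,
              NA, NA,        0.0455565,
              NA, NA,        NA), ncol = 3)

# The logical exponent acts as a keep/drop switch
NA^FALSE   # 1  (keep: multiplying by 1 leaves the value alone)
NA^TRUE    # NA (drop: multiplying by NA gives NA)

mask   <- NA^(m <= 0.7)   # 1 where the value is kept, NA everywhere else
masked <- m * mask

# replace() expresses the same thing more directly
replaced <- replace(m, m <= 0.7, NA)
all.equal(masked, replaced)
```

Both forms keep the 3x3 shape; replace() is probably the easier one to read later.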

Upvotes: 1

Pexav01

Reputation: 51

You can apply a function to each element of a matrix with the apply() family of functions, but for a simple condition like this a vectorized operation is usually more efficient in R.

Solution: use the vectorized ifelse() function to apply your condition across all elements at once. Because ifelse() returns a result shaped like its test, the matrix structure is preserved:

# Define your sample matrix
N <- 3
N_vec <- 3
set.seed(123)
x1 <- runif(N * N_vec)
mat_x1 <- matrix(x1, ncol = N_vec)
mat_x1[upper.tri(mat_x1)] <- NA
diag(mat_x1) <- NA

# Apply the function using ifelse()
mat_x2 <- ifelse(!is.na(mat_x1) & mat_x1 > 0.7, mat_x1, NA)

# Output the new matrix
mat_x2
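
As a quick check of the two claims above (shape is preserved, NA is handled), here is a small sketch. The matrix m is a hard-coded copy of the question's values, and the names kept and guarded are only illustrative. It also suggests the !is.na() guard is optional here, since ifelse() returns NA wherever its test is NA.

```r
# Hard-coded copy of the question's mat_x1 so the block is self-contained
m <- matrix(c(NA, 0.7883051, 0.4089769,
              NA, NA,        0.0455565,
              NA, NA,        NA), ncol = 3)

# With the explicit NA guard
guarded <- ifelse(!is.na(m) & m > 0.7, m, NA)

# Without the guard: an NA test yields an NA result
kept <- ifelse(m > 0.7, m, NA)

dim(kept)                  # still a 3 x 3 matrix
all.equal(kept, guarded)   # both versions agree on this input
```

The guard does no harm and makes the intent explicit, so keeping it is a matter of taste.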

Upvotes: -1

Rui Barradas

Reputation: 76575

Though there is an accepted answer, I think it's worth posting an is.na<- solution.
On the RHS you put an index vector saying which values are to become NA. In this case that vector is the logical condition marking the values to drop (the negation of the condition you want to keep).

# this is the question's condition, negated
# is.na(mat_x1) <- !(mat_x1 > 0.7)
#
is.na(mat_x1) <- mat_x1 <= 0.7
mat_x1
#>           [,1] [,2] [,3]
#> [1,]        NA   NA   NA
#> [2,] 0.7883051   NA   NA
#> [3,]        NA   NA   NA

Created on 2024-12-16 with reprex v2.1.1
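
Note that is.na<- modifies the object it is applied to, so a sketch like the following works on copies when the original is still needed. The matrix m and the copies a, b and d are names made up for this example; the index can be a logical condition, its negation, or integer positions from which().

```r
# Hard-coded copy of the question's mat_x1
m <- matrix(c(NA, 0.7883051, 0.4089769,
              NA, NA,        0.0455565,
              NA, NA,        NA), ncol = 3)

a <- m
is.na(a) <- a <= 0.7           # logical index: values to drop

b <- m
is.na(b) <- !(b > 0.7)         # the keep-condition, negated

d <- m
is.na(d) <- which(d <= 0.7)    # integer positions work as well

all.equal(a, b)
all.equal(a, d)
```

All three leave m itself untouched and keep the matrix dimensions.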

Upvotes: 4

jpsmith

Reputation: 17450

The other answers are great. As a variant for this specific question: if your ultimate goal is to identify row/column combinations with a value greater than some threshold (e.g., 0.7), you can return the indices of those combinations with which() and avoid inspecting the matrix by eye:

which(mat_x1 > 0.7, arr.ind = TRUE)
#      row col
# [1,]   2   1

If your matrix had row and column names and you wanted to get fancy, you could write a small helper function to present the result neatly. Here is an example using a new matrix with fruits as row names and animals as column names:

mat2 <- matrix(c(0.75, 0.75, 0.2, 
                 0.3, 0.4, 0.75, 
                 0.5, 0.3, 0.9), 
               nrow = 3, 
               dimnames = list(c("Apples", "Oranges", "Bananas"), 
                               c("Dog", "Cat", "Hampster")))

myFun <- function(mtrx, thresh){
  indcs <- which(mtrx >= thresh, arr.ind = TRUE)
  data.frame(row = rownames(mtrx)[indcs[, "row"]],
             column = colnames(mtrx)[indcs[, "col"]],
             value = mtrx[indcs])
  }

myFun(mat2, 0.7)

#      row   column value
# 1  Apples      Dog  0.75
# 2 Oranges      Dog  0.75
# 3 Bananas      Cat  0.75
# 4 Bananas Hampster  0.90
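
For a full symmetric similarity matrix (rather than one already reduced to its lower triangle), the same idea can be limited to entries below the diagonal so each pair is reported once. This is only a sketch: the matrix sim and the variable names v1 to v3 are invented for illustration.

```r
# Hypothetical symmetric similarity matrix with named variables
vars <- c("v1", "v2", "v3")
sim <- matrix(c(1.00, 0.80, 0.40,
                0.80, 1.00, 0.75,
                0.40, 0.75, 1.00),
              nrow = 3, dimnames = list(vars, vars))

# Only look below the diagonal, so the diagonal and mirrored pairs are skipped
hits <- which(lower.tri(sim) & sim > 0.7, arr.ind = TRUE)

pairs <- data.frame(row    = rownames(sim)[hits[, "row"]],
                    column = colnames(sim)[hits[, "col"]],
                    value  = sim[hits])
pairs
```

Here pairs should list v2/v1 (0.80) and v3/v2 (0.75), with the diagonal of ones left out.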

Upvotes: 4

Frederi ROSE

Reputation: 351

N <- 3 # Observations
N_vec <- 3 # Number of vectors
set.seed(123)
x1 <- runif(N * N_vec)
mat_x1 <- matrix(x1, ncol = N_vec)
mat_x1[upper.tri(mat_x1)] <- NA
diag(mat_x1) <- NA
mat_x1

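new_function <- function(mat_x1) {
  # TRUE where the value exceeds the threshold, NA where the input is NA
  truth_mat <- mat_x1 > .7
  # Multiplying by the logical matrix turns values at or below the threshold into 0
  newmat <- mat_x1 * truth_mat
  # Replace those zeros with NA
  newmat[newmat == 0] <- NA
  return(newmat)
}

new_function(mat_x1)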

This returns (first the input matrix, then the result):

          [,1]      [,2] [,3]
[1,]        NA        NA   NA
[2,] 0.7883051        NA   NA
[3,] 0.4089769 0.0455565   NA
          [,1] [,2] [,3]
[1,]        NA   NA   NA
[2,] 0.7883051   NA   NA
[3,]        NA   NA   NA

Upvotes: 2

Ben Bolker

Reputation: 226557

@RonakShah's answer is better, but for completeness (e.g. if you had a function that was hard to vectorize), you can use apply() over both margins of the matrix:

f <- function(x) if (!is.na(x) && x > .7) x else NA
apply(mat_x1, MARGIN = c(1,2), FUN = f)
          [,1] [,2] [,3]
[1,]        NA   NA   NA
[2,] 0.7883051   NA   NA
[3,]        NA   NA   NA
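
Another way to run a scalar-only function over every element, sketched here as an alternative rather than a recommendation, is vapply() with the result assigned back into a copy so the dimensions survive. The names m, f_scalar and out are made up for this example, and the function returns NA_real_ so every result has the same type.

```r
# Hard-coded copy of the question's mat_x1
m <- matrix(c(NA, 0.7883051, 0.4089769,
              NA, NA,        0.0455565,
              NA, NA,        NA), ncol = 3)

# Works on one value at a time only
f_scalar <- function(x) if (!is.na(x) && x > 0.7) x else NA_real_

out <- m                                  # the copy carries the dim attribute
out[] <- vapply(m, f_scalar, numeric(1))  # out[] keeps the shape on assignment
out

# Same result as apply() over both margins
all.equal(out, apply(m, MARGIN = c(1, 2), FUN = f_scalar))
```

vapply() also checks that each call returns exactly one numeric value, which can catch mistakes in a more complicated function.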

Upvotes: 5

Ronak Shah

Reputation: 389145

In this case, you can simply do:

mat_x1[mat_x1 <= .7] <- NA

#          [,1] [,2] [,3]
#[1,]        NA   NA   NA
#[2,] 0.7883051   NA   NA
#[3,]        NA   NA   NA

If this is just an example and you want to apply some variation of the function y, do the following. First make sure that your function is vectorized and can handle multiple values, which in this case is as simple as changing if to ifelse. Then apply the function to the whole matrix.

y = function(x) ifelse(x > .7, x, NA)
y(mat_x1)
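
To illustrate the difference, here is a rough sketch comparing the scalar if version with a vectorized one. The names m, y_scalar and y_vec, and the threshold argument, are assumptions added for this example and are not part of the question.

```r
# Hard-coded copy of the question's mat_x1
m <- matrix(c(NA, 0.7883051, 0.4089769,
              NA, NA,        0.0455565,
              NA, NA,        NA), ncol = 3)

# The question's scalar version: `if` needs a single TRUE/FALSE
y_scalar <- function(x) if (x > 0.7) x else NA
res <- try(y_scalar(m), silent = TRUE)
inherits(res, "try-error")   # TRUE: calling it on the whole matrix fails

# Vectorized version with a hypothetical `threshold` argument
y_vec <- function(x, threshold = 0.7) ifelse(x > threshold, x, NA)
y_vec(m)        # keeps only the 0.7883051 entry
y_vec(m, 0.4)   # a lower threshold also keeps 0.4089769
```

Making the cutoff an argument means the same function can be reused for other thresholds.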

Upvotes: 7
