Josh

Reputation: 85

In R, Replace Values in a Matrix based on the Factor

I am trying to set parts of a covariance matrix to 0 depending on whether the two variables involved belong to the same factor. A short example would be a 4x4 matrix covering two factors, each made up of two x variables: x1 and x2 form one factor, x3 and x4 the other.

The following code generates a covariance matrix.

dataframe <- matrix(c(18, 29, 13, 56, 64, 23, 56, 92, 23, 65, 28, 54, 46, 82, 46, 92), 4, 4)
colnames(dataframe) <- (c("x1", "x2", "x3", "x4"))
rownames(dataframe) <- (c("x1", "x2", "x3", "x4"))
cov.d <- cov(dataframe)
round(cov.d, 2)

       x1      x2      x3     x4
x1 368.67  294.67  252.33 414.00
x2 294.67  806.25 -161.50  80.83
x3 252.33 -161.50  409.67 446.33
x4 414.00   80.83  446.33 577.00

Assuming x1 and x2 are part of factor 1, and assuming x3 and x4 are part of factor 2, I would like the output to look like the following:

       x1      x2      x3     x4
x1 368.67  294.67    0.00   0.00
x2 294.67  806.25    0.00   0.00
x3   0.00    0.00  409.67 446.33
x4   0.00    0.00  446.33 577.00

I'd imagine the solution would have something to do with a loop and the replace function.

So far I have tried the following, with these parameters set:

num.factors <- 2; vars.per.factor <- 10; num.vars <- 20

for(k in 1:num.factors)
  for(i in 1:vars.per.factor)
    for(j in 1:num.vars) {
      factor.cov <- replace(cov.d, cov.d[i + k * vars.per.factor - vars.per.factor, j + k * vars.per.factor - vars.per.factor], 0)
}

The issue lies within the replace function, specifically the cov.d[i, i] part. I know [i, i] is not the code needed to do what I want, but I'm drawing a blank on where to go from here. I'll keep playing around with it and update my progress as I go along.
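For context on why the loop above misbehaves: replace(x, list, values) is shorthand for x[list] <- values, so its second argument has to be an index (positions or a logical mask). In the loop, cov.d[..., ...] returns the value stored in one cell, and that value is then used as a position. In addition, factor.cov is rebuilt from the unmodified cov.d on every pass, so earlier replacements are lost. A minimal sketch of the intended usage, with illustrative names (m, lower.left, out) that are not part of the original code:

```r
m <- matrix(1:16, nrow = 4)

# replace() needs an index (positions or a logical mask), not a value read from m.
lower.left <- row(m) > 2 & col(m) <= 2   # TRUE for rows 3-4, columns 1-2
out <- replace(m, lower.left, 0)         # same effect as m[lower.left] <- 0 on a copy
out
```

The mask is built once and applied in a single call, so no loop is needed for this block.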

Thank you for your help!

Upvotes: 2

Views: 969

Answers (3)

mmyoung77

Reputation: 1417

This is how I would do it using only base R. I do not claim that this is the most efficient code: I tend to write extra commands to make it extra-simple to read, and %in% will slow things down if your list of factors is very long or if the factors themselves have a lot of elements. It gets the job done for small groups of small factors, though, and they don't even have to be the same size. Note that the function returns a data frame rather than a matrix.

facReplace <- function(m, f) {
  # f is a list of factors, f1, f2, ..., fn, each a character vector of variable names
  # They are combined into a single vector called x
  # Also make a data-frame copy of the matrix m
  x <- do.call("c", f)
  m1 <- data.frame(m)
  row.names(m1) <- x
  names(m1) <- x

  # loop over every factor and every cell, using %in% to set items in m1 that don't share a factor to 0
  for (i in seq_along(f)) {
    tempfac <- f[[i]]
    for (j in seq_along(x)) {
      for (k in seq_along(x)) {
        temprow <- x[j]
        tempcol <- x[k]
        if (!(temprow %in% tempfac) && (tempcol %in% tempfac)) m1[j, k] <- 0
      }
    }
  }
  return(m1)
}

# Test the function on a random 4x4 matrix with the same factor layout as the original example
set.seed(123)
thedata <- matrix(data = runif(16, 0, 10), nrow = 4, ncol = 4)
thedata

         [,1]     [,2]     [,3]     [,4]
[1,] 2.875775 9.404673 5.514350 6.775706
[2,] 7.883051 0.455565 4.566147 5.726334
[3,] 4.089769 5.281055 9.568333 1.029247
[4,] 8.830174 8.924190 4.533342 8.998250

factor1 <- c("x1", "x2")
factor2 <- c("x3", "x4")
theFactors <- list(factor1, factor2)

facReplace(thedata, theFactors)

         x1       x2       x3       x4
x1 2.875775 9.404673 0.000000 0.000000
x2 7.883051 0.455565 0.000000 0.000000
x3 0.000000 0.000000 9.568333 1.029247
x4 0.000000 0.000000 4.533342 8.998250
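The same result can probably be had without the three nested loops. The sketch below is not the function above but a vectorized variant: it gives each variable a factor id and uses outer() to build a mask that is TRUE only where the row and column share a factor. The helper name facMask is hypothetical, and the sketch assumes every variable belongs to exactly one factor and that the factors are listed in the same order as the matrix rows and columns.

```r
# Hypothetical helper: zero every cell whose row and column variables
# belong to different factors. Returns a matrix, not a data frame.
facMask <- function(m, f) {
  vars <- unlist(f)                              # variable names, in matrix order
  grp  <- rep(seq_along(f), times = lengths(f))  # factor id of each variable
  out  <- m * outer(grp, grp, "==")              # TRUE/FALSE mask acts as 1/0
  dimnames(out) <- list(vars, vars)
  out
}

set.seed(123)
thedata <- matrix(runif(16, 0, 10), nrow = 4, ncol = 4)
theFactors <- list(c("x1", "x2"), c("x3", "x4"))
masked <- facMask(thedata, theFactors)
masked
```

Because the mask is applied by multiplication, NA or infinite entries outside the blocks would not become 0; assigning through a logical index is the safer choice if the matrix may contain them.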

Upvotes: 0

shea

Reputation: 528

I have a feeling this has probably been asked before, but can't find any questions with a quick search.

Anyway, the following will do what you want:

setzero <- function(x) {
  # assumes exactly two factors of equal size, in consecutive rows/columns
  n <- nrow(x)
  half <- n / 2
  x[(half + 1):n, 1:half] <- 0  # lower-left block
  x[1:half, (half + 1):n] <- 0  # upper-right block
  return(x)
}

> cov.d <- setzero(cov.d)
> cov.d
         x1       x2       x3       x4
x1 368.6667 294.6667   0.0000   0.0000
x2 294.6667 806.2500   0.0000   0.0000
x3   0.0000   0.0000 409.6667 446.3333
x4   0.0000   0.0000 446.3333 577.0000

This works as a quick generic function, as you indicated you wanted, provided there are exactly two factors of equal size occupying consecutive rows and columns. There are probably more elegant solutions.
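For more than two factors (the question mentions num.factors and vars.per.factor), one possible extension is to build a block-diagonal pattern with kronecker() and zero everything outside it. This is only a sketch under the assumption that all factors have the same size and occupy consecutive rows and columns; the name setzero_blocks and the 6x6 test matrix are made up for illustration.

```r
# Hypothetical generalisation of setzero() to any number of equal-sized factors.
setzero_blocks <- function(x, num.factors) {
  stopifnot(nrow(x) == ncol(x), nrow(x) %% num.factors == 0)
  vars.per.factor <- nrow(x) / num.factors
  # block-diagonal matrix of 1s: one vars.per.factor-sized block per factor
  keep <- kronecker(diag(num.factors), matrix(1, vars.per.factor, vars.per.factor))
  x[keep == 0] <- 0   # zero every cell outside the diagonal blocks
  x
}

m <- matrix(1:36, 6, 6)
res <- setzero_blocks(m, num.factors = 3)
res
```

With num.factors = 2 on a 4x4 matrix this should zero the same two blocks as setzero above.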

Upvotes: 0

www

Reputation: 39154

You can use row and column indices to assign 0 to the off-diagonal blocks.

m <- matrix(data = c(18, 29, 13, 56, 64, 23, 56, 92, 
                     23, 65, 28, 54, 46, 82, 46, 92),
            nrow = 4, byrow = TRUE)

m[3:4, 1:2] <- 0
m[1:2, 3:4] <- 0
m
#     [,1] [,2] [,3] [,4]
#[1,]   18   29    0    0
#[2,]   64   23    0    0
#[3,]    0    0   28   54
#[4,]    0    0   46   92
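The same kind of index assignment also works on the covariance matrix from the question, and because cov() keeps the column names, the blocks can be selected by name instead of by position. A small sketch, assuming the factor membership given in the question (the vectors factor1 and factor2 are introduced here for illustration):

```r
dataframe <- matrix(c(18, 29, 13, 56, 64, 23, 56, 92,
                      23, 65, 28, 54, 46, 82, 46, 92), 4, 4)
colnames(dataframe) <- c("x1", "x2", "x3", "x4")
cov.d <- cov(dataframe)

factor1 <- c("x1", "x2")
factor2 <- c("x3", "x4")
cov.d[factor1, factor2] <- 0   # upper-right block
cov.d[factor2, factor1] <- 0   # lower-left block
round(cov.d, 2)
```

Indexing by name keeps working if the variables of a factor are not stored next to each other.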

Upvotes: 1
